POSTED 4/7/02: Below I've made available a handful of papers that describe some methods we might want to try applying to a variety of market prediction and trading problems. With the exception of the last paper, all of them fall into the first, "supervised learning" category of the informal presentation I made on Friday. These may not even be the best papers on these topics, but I thought they would provide a reasonable starting point for more specific discussions. Also, if anyone wants me to post other non-proprietary material here, just send me postscript (preferred) or PDF, or links to the material.

Here's a survey paper on the family of algorithms broadly known as boosting methods.
[Postscript] [PDF]

This paper describes a particular variant of boosting developed for document classification. In such applications, there is a huge number of potential attributes or factors from which one may want to do prediction, but most of them will be irrelevant to any particular example.
[Postscript] [PDF]

This one describes an application of boosting-like methods to a simulated auction scenario. I haven't read this one myself yet, but thought it might hit closer to home.
[Postscript] [PDF]

Here's a paper on so-called multiplicative update methods, which also often work well with huge numbers of features...
[Postscript] [PDF]

...and here is an application of such methods to portfolio selection.
[Postscript] [PDF]
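
For anyone who wants a concrete feel for a multiplicative update before reading, here is a minimal sketch of one well-known rule of this kind, the exponentiated-gradient update applied to portfolio weights. It is only an illustration and is not necessarily the algorithm in either paper above. The function name `eg_update`, the learning rate `eta`, and the sample price relatives are all assumptions made up for this example.

```python
import math

def eg_update(weights, price_relatives, eta=0.05):
    """One exponentiated-gradient step on portfolio weights (illustrative only).

    weights         -- current allocation, non-negative and summing to 1
    price_relatives -- today's price divided by yesterday's, one per asset
    eta             -- learning rate (hypothetical default, not from any paper)
    """
    # Growth factor of the whole portfolio over this period.
    portfolio_return = sum(w * x for w, x in zip(weights, price_relatives))
    # Multiply each weight by an exponential factor that favors assets
    # that did better than the portfolio as a whole.
    unnormalized = [
        w * math.exp(eta * x / portfolio_return)
        for w, x in zip(weights, price_relatives)
    ]
    # Renormalize so the weights sum to 1 again.
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Made-up price relatives for two assets over three periods.
weights = [0.5, 0.5]
for relatives in [(1.10, 0.95), (1.05, 1.00), (0.98, 1.02)]:
    weights = eg_update(weights, relatives)
```

Because each weight is rescaled multiplicatively rather than shifted additively, a weight that starts positive stays positive, and the renormalization keeps the allocation summing to one.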

Here is a tutorial on Support Vector Machines (SVMs), a different approach to problems with many features/factors. I only have it in PDF, and I haven't read it myself yet.
[PDF]

This one is a bit of an outlier, but it's an attempted application of the later material I talked about (using Markov decision processes to model one's own effects on the environment/market) to automated market-making.
[Postscript] [PDF]

POSTED 4/15/02: The two papers below were posted after a conversation with Alex about different methods of limiting complexity to avoid overfitting when learning a probability distribution from sample data.

This one emphasizes complexity regularization in classification problems, but every aspect has an analogue in the distribution-learning setting.
[Postscript] [PDF]

This one develops the tools needed to build complexity regularization methods for distribution learning.
[Postscript] [PDF]

POSTED 4/15/02: Some ECN data files:
QQQ QQQ2 YHOO IBM

POSTED 6/5/02: Papers related to a conversation with Jeff W. and Michael B. on gradient methods for searching in a parametric strategy space:

[BaxterBartlett, Postscript] [BaxterBartlett, PDF]

[BaxterWeaverBartlett, Postscript] [BaxterWeaverBartlett, PDF]

[BaxterBartlett2, Postscript] [BaxterBartlett2, PDF]

[Sutton et al., Postscript] [Sutton et al., PDF]

[Kearns et al., Postscript] [Kearns et al., PDF] (Mainly appendix B)

