 |
2004 . 2003 . 2002 . 2001 . 2000 . 1999 . 1998 . 1997 . 1996 . 1995 . 1994 . 1993 . 1992 . 1991 . 1990
MS-CIS-02-01
Alin Deutsch and Val Tannen
This paper examines some of the issues that arise in the
process of XML publishing of mixed-storage proprietary data.
We argue that such data will reside typically in RDBMS's and/or
LDAP, etc, augmented with a set of native XML documents. An
additional challenge is to take advantage of redundancy in
the storage schema, such as mixed materialized views that are
stored for the purpose of enhancing performance.
We argue that such systems need to take into consideration
mappings in both directions between the proprietary schema
and the published schema. Thus, reformulating queries on the
(published) XML schema into executable queries on the stored
data will require the effect of both composition-with-views
(as in SilfRoute and XPERANTO) and rewriting-with-views (as
in the Information Manifold and Agora).
Using any of the simple encodings of relational data as XML,
the mappings between schemas and the materialized views can
be expressed in XQery, just like the queries on the published
schema. For query reformulation we give an algorithm that uses
logical assertions to capture formally the semantics of a large
part of XQuery. We also give a completeness theorem for our
reformulation algorithm. The algorithm was implemented in an
XML query rewriting system and we present a suite of experiments
that validate this technique.
MS-CIS-02-02
Franjo Ivancic
The DARPA MoBIES Automotive Vehicle-Vehicle Open Experimental
Platform [14] defines a longitudinal controller for the leader
car of a platoon moving in an Intelligent Vehicle Highway System
(IVHS) autonomously. The challenge is to verify that cars using
this longitudinal controller provide a safe (that is, collision-free)
ride. This report presents the process of verifying this particular
controller using our CHARON [2] toolkit. In particular, it
involves modeling and simulation of the system in CHARON, and
verifying the controller using our predicate abstraction technique
for hybrid systems [3].
MS-CIS-02-03
Validating Constraints in XML
Yi Chen, Susan B. Davidson and Yifeng Zheng
MS-CIS-02-04
Constraint Preserving XML Storage in Relations
Yi Chen, Susan B. Davidson and Yifeng Zheng
MS-CIS-02-05
Byron Choi
DTDs have proved important in a variety of areas: transformations
between XML and databases, XML storage,XML publishing, consistency
analysis of XML specifications, typechecking, and optimization
of XML queries. Much of this work depends on certain assumptions
about DTDs, e.g., the absence of recursion and nondeterminism.
With this comes the need to justify these assumptions against
DTDs in the real world. This paper surveys a number of DTDs
collected from the Web, and provides statistics with respect
to a variety of criteria commonly discussed in XML research.
MS-CIS-02-06
Siome Goldenstein, Christian Vogler and Dimitris Metaxas
In this paper we describe how the connections between affine
forms, zonotopes, and Gaussian distributions help us devise
an automated cue integration technique for tracking deformable
models. This integration technique is based on the confidence
estimates of each cue.
We use affine forms to bound these confidences. Affine forms
represent bounded intervals, with a well-defined set of arithmetic
operations. They are constructed from the sum of several independent
components. An n-dimensional affine form describes a complex
convex polytope, called a zonotope. Because these components
lie in bounded intervals, Lindeberg's theorem, a modified version
of the central limit theorem, can be used to justify a Gaussian
approximation of the affine form.
We present a new expectation-based algorithm to find the
best Gaussian approximation of an affine form. Both the new
and the previous algorithm run in O(n^2m) time, where n is
the dimenstion of the affine form, and m is the number of independent
components. The constants in the running time of new algorithm,
however, are much smaller, and as a result it runs 40 times
faster than the previous one for equal inputs.
We show that using the Berry-Esseen theorem it is possible
to calculate an upper bound for the error in the Gaussian approximation.
Using affine forms and the conversion algorithm, we create
a method for automatically integrating cues in the tracking
process of a deformable model. The tracking process is described
as a dynamical system, in which we model the force contribution
of each cue as an affine form. We integrate their Gaussian
approximations using a Kalman filter as a maximum likelihood
estimator. This method not only provides an integrated result
that is dependent on the quality of each one of the cues, but
also provides a measure of confidence in the final result.
We evaluate our new estimation algorithm in experiments, and
we demonstrate our deformable model-based face tracking system
as an application of the algorithm.
MS-CIS-02-07
K.G. Anagnostakis, S. Ioannidis, S. Miltchev, J. Ioannidis,
M. Greenwald, J.M. Smith
Packet monitoring arguably needs the flexibility of open
architectures and active networking. A significant challenge
in the design of open packet monitoring systems is how to effectively
strike a balance between flexibility, safety and performance.
In this paper we investigate the performance of FLAME, a system
that emphasizes flexibility by allowing applications to execute
arbitrary code for each packet received. Our system attempts
to achieve high performance without sacrificing safety by combining
the use of a type-safe language, lightweight run-time checks,
and fine-grained policy restrictions. Experiments with our
prototype implementation demonstrate the ability of our system
to support representative application workloads on Bgit/s links.
Such performance indicates the overall efficiency of our approach;
more narrowly targeted experiments demonstrate that the overhead
required to provide safety is acceptable.
MS-CIS-02-08
Albert Montillo
The 3D medical image segmentation problem typically involves
assigning labels to 3D pixels, called voxels, which comprise
a given medical volume. In its simplest form the segmentation
problem involves
assigning the labels "part of the structure of interest" or "not part of the
structure" to each voxel using locally measured properties and prior knowledge
of human anatomy. Robust segmentation remains an open research problem today
due to the significant challenges in the task including: partial volume averaging,
overlapping intensity distributions and image noise. In the face of these challenges
prior knowledge needs to be added to make the segmentation methods more robust.
Active contours were introduced in the late 1980's mainly to address situations
in which the object to be segmented had a single closed boundary. To address
situations in which the object(s) to be segmented have unknown topology the level
set framework was recently introduced to segment medical images. Unlike active
contours, the level set method relies on an implicit shape representation rather
than an explicit shape representation and hence new methods to impose prior knowledge
about expected shape have to be devised for the new framework. This paper explores
recent segmentation methods from four research groups which address the task
of imposing prior knowledge of shape for object boundary segmentation. Three
of the methods impose priors onto the level set technique and one employs a medial
axis shape representation and statistical shape information to guide a model-based
segmentation. All of the methods include a notion of a statistical shape distribution.
Each method is described, analyzed for its strengths and weaknesses. The paper
concludes with a comparison of all four methods and recommendations for their
applicability and extension.
MS-CIS-02-09
Benedikt Bollig, Martin Leucker, and Philipp Lucas
We extend the formal developments for message sequence charts (MSCs) to support
scenarios with lost and found messages. We define a notion of extended compositional
message sequence charts (ECMSCs) which subsumes the notion of compositional
message sequence charts in expressive power but additionally allows to define
lost and found messages explicitly. As usual, ECMSCs might be combined by means
of choice and repetition towards (extended) compositional message sequence
graphs. We show that---despite extended expressive power---model checking of
monadic second-order logic (MSO) for this framework remains to be decidable.
The key technique to achieve our results is to use an extended notion for linearizations.
MS-CIS-02-10
Quantitative Modeling of Contention in Distributed Algorithms
Michael Greenwald
MS-CIS-02-11
Hyoung S. Hong, Insup Lee, Na Young Lee, Martin Leucker, Oleg Sokolsky
In this report, we summarize our approach for testing the Electronic Throttle
Control (ETC) system. We reformulate the ETC model based on the MATLAB/SIMULINK
model provided by the Berkeley group. We specify the ETC model using the hybrid
modeling language called Charon. From the Charon model, we generate test sequences
based on the control-flow and data-flow criteria. These are transformed into
test cases which may be used to test an implementation of the ETC system.
MS-CIS-02-12
Insup Lee, Anna Philippou, and Oleg Sokolsky
The paper describes a unified formal framework for designing and reasoning
about power-constrained, timed systems. The framework is based on process algebra,
a formalism which has been developed to describe and analyze communicating,
concurrent systems. The proposed extension allows the modeling of probalistic
resource failures, priorities of resource usages, and power consumption by
resources within the same formalism. Thus, it is possible to study several
alternative power-consumption behaviors and tradeoffs in their timing and other
characteristics. This paper describes the modeling and analysis techniques,
and illustrates them with examples, including a dynamic voltage-scaling algorithm.
MS-CIS-02-13
The PEPL Specification
Sotiris Ioannidis
MS-CIS-02-14
Dianna Xu
The classification theorem for compact surfaces is a formidable result. This
result was obtained in the early $1920$'s, and was the culmination of the work
of many. The theorem gives a simple way of obtaining all compact $2$-manifolds,
moreover, as a result of the theorem, it's possible to decide whether or not
any two compact surfaces are homeomorphic rather easily. Before the statement
of the theorem, quite a bit of basic topological concepts are first introduced,
including connectivity, compactness and quotient topology. In addition to that,
a rigorous proof requires, among other things, a precise definition of a surface,
orientability, a notion of generalized triangulation, and a precise way of determining
whether two surfaces are homeomorphic, which requires some notion of algebraic
topology. All
of the above brings together the final proof of the theorem.
MS-CIS-02-15
Kostas G. Anagnostakis, Raphael S. Ryger, Michael B. Greenwald
While network simulations for congestion control studies have often varied
traffic loads and protocol parameters, they have typically investigated only
a few topologies. The most common is by far the so-called ``barbell'' topology.
In this paper we argue, first, that the barbell topology is not representative
of the Internet. In particular, we report that a measurable fraction of packets
pass through multiple congestion points. Second, we argue that the distinction
between the ``barbell'' topology and more complex topologies is relevant by
presenting a scenario with multiple congestion points that exhibits behavior
that seems unexpected based on intuition derived from the barbell topology
(in particular, a TCP-only system that exhibits behavior technically considered
``congestion collapse''). We make the larger argument that the typical methodology
currently accepted for evaluating network protocols is flawed. Finally, we
briefly comment on some issues that arise in designing a simulation methodology
that will be better suited to comparison of network protocol performance.
MS-CIS-02-16
Propagating XML Constraints to Relations
S. Davidson, W. Fan, C. Hara and J. Qin
We present a technique for refining the design of relational storage for
XML data. The technique is based on XML key propagation : given a
set of keys on XML data, what functional dependencies must hold on a relational
view of the data? With the functional dependencies one can convert the relational
design into, e.g. 3NF, BCNF, and thus develop efficient relational storage
for XML data. We provide several algorithms for computing XML key propagation.
One algorithm is to check whether a functional dependency is propagated from
XML keys via a predefined view, which helps one determine whether the design
is in a normal form; the others are to compute a minimum cover for all functional
dependencies on a universal relation given certain XML keys, which provides
guidance for how to design a view. Our experimental results show that these
algorithms are efficient in practice. We also investigate the complexity of
propagating other XML constraints to relations. The ability to compute XML
key propagation is a first step toward establishing the connection between
XML data and its relational representation at the semantic level. Our algorithms
can be incorporated into other relational storage techniques. They can also
be generalized to study transformations of hierarchically structured data such
as scientific databases.
MS-CIS-02-17
Lawrence K. Saul, Daniel D. Lee, Charles L. Isbell, Yann LeCun
We have implemented a real time front end for detecting voiced speech and
estimating its fundamental frequency. The front end performs the signal processing
for voice-driven agents that attend to the pitch contours of human speech and
provide continuous audiovisual feedback. The algorithm we use for pitch tracking
has several distinguishing features: it makes no use of FFTs or autocorrelation
at the pitch period; it updates the pitch incrementally on a sample-by-sample
basis; it avoids peak picking and does not require interpolation in time or
frequency to obtain high resolution estimates; and it works reliably over a
four octave range, in real time, without the need for postprocessing to produce
smooth contours. The algorithm is based on two simple ideas in neural computation:
the introduction of a purposeful nonlinearity, and the error signal of a least
squares fit. The pitch tracker is used in two real time multimedia applications:
a voice-to-midi player that synthesizes electronic music from vocalized melodies,
and an audiovisual Karaoke machine with multimodal feedback. Both applications
run on a laptop and display the user's pitch scrolling across the screen as
he or she sings into the computer.
MS-CIS-02-18
Lawrence K. Saul, and Sam T. Roweis
The problem of dimensionality reduction arises in many fields of information
processing, including machine learning, data compression, scientific visualization,
pattern recognition, and neural computation. Here we describe locally linear
embedding (LLE), an unsupervised learning algorithm that computes low dimensional,
neighborhood preserving embeddings of high dimensional data. The data, assumed
to lie on a nonlinear manifold, is mapped into a single global coordinate system
of lower dimensionality. The mapping is derived from the symmetries of locally
linear reconstructions, and the actual computation of the embedding reduces
to a sparse eigenvalue problem. Notably, the optimizations in LLE - though
capable of generating highly nonlinear embeddings - are simple to implement,
and they do not involve local minima. We describe the implementation of the
algorithm in detail and discuss several extensions that enhance its performance.
The algorithm is applied to manifolds of known structure, as well as real data
sets consisting of images of faces, digits, and lips. We provide extensive
illustrations of the algorithm's performance.
MS-CIS-02-19
Fei Sha, Lawrence K. Saul, and Daniel D. Lee
We derive multiplicative updates for solving the nonnegative quadratic programming
problem in support vector machines (SVMs). The updates have a simple closed
form, and we prove that they converge {monotonically} to the solution of the
maximum margin hyperplane. The updates optimize the traditionally proposed
objective function for SVMs. They do not involve any heuristics such as choosing
a learning rate or deciding which variables to update at each iteration. They
can be used to adjust all the quadratic programming variables {in parallel}
with a guarantee of improvement at each iteration. We analyze the asymptotic
convergence of the updates and show that the coefficients of non-support vectors
decay geometrically to zero at a rate that depends on their margins. In practice,
the updates converge very rapidly to good classifiers.
MS-CIS-02-20
Daniel Rudoy
This paper is concerned with the group-theoretical background of integral
transforms and the role they play in signal processing on the sphere. An overview
of topological groups, measure and representation theories, and an overview
of the Windowed Fourier Transform and the Continuous Wavelet Transform are
presented. A group-theoretical framework within which these transforms arise
is presented. The connection between integral transforms and square-integrable
group representations is explored. The theory is also generalized beyond groups
to homogeneous spaces. The abstract theory is then applied to signal processing
on the sphere with a discussion of both global and local methods. A detailed
derivation of the continuous spherical harmonic transform is presented. Global
methods such as the spherical Fourier transform and convolution are presented
in an appendix as well as some background material from group theory and topology.
MS-CIS-02-21
Kostas G. Anagnostakis and Michael Greenwald
MS-CIS-02-22
Amir Roth, Anne Bracy, and Vlad Petric
Register integration (or just integration) is a register renaming discipline
that implements instruction reuse via physical register sharing. Initially
developed to perform squash reuse, the integration mechanism is a powerful
reuse tool that can exploit more reuse scenarios. In this paper, we describe
three extensions to the initial integration mechanism that expand its applicability
and boost its performance impact. First, we extend squash reuse to general
reuse. Wheras squash reuse maintains the superscalar concept of an instruction
instance "owning" its output physical register we allow multiple instructions
to simultaneously and seamlessly share a single physical register. Next, we
replace the PC-indexing scheme used by squash reuse with an opcode-based
indexing scheme that exposes more integration opportunities. Finally, we
introduce an extension called reverse integration in which we speculatively
create integration entries for the inverses of operations-for instance, when
renaming an add, we create an entry for the inverse subtract. Reverse integration
allows us to reuse operations that were not specified by the original program.
We use reverse integration to obtain a free implementation of speculative memory
bypassing for stack-pointer based loads (register fills and restores).
Our evaluation shows that these extensions increase the integration rate-the
number of retired instructions that integrate older results and bypass the
execution engine-to an average of 17% on the SPEC2000 integer benchmarks. On
a 4-way superscalar processor with an aggressive memory system, this translates
into an average IPC improvement of 8%. The fact that integrating instructions
completely bypass the execution engine raises the possibility of using integration
as a low-complexity substitute for execution bandwidth and issue buffering.
Our experiments show that such a trade-off is possible, enabling a range of
IPC/complexity designs.
MS-CIS-02-23
Amir Roth and Gurindar S. Sohi
Pre-execution attacks cache misses for which conventional address-prediction
driven prefetching is ineffective. In pre-execution, copies of cache miss computations
are isolated from the main program and launched as separate threads called
p-threads whenever the processor anticipates an upcoming miss. P-thread
selection is the task of deciding what computations should execute on p-threads
and when they should be launched such that total execution time is minimized.
P-thread selection is central to the success of pre-execution.
We introduce a framework for automated static p-thread selection,
a static p-thread being one whose dynamic instances are repeatedly launched
during the course of program execution. Our approach is to formalize the problem
quantitatively and then apply standard techniques to solve it analytically.
The framework has two novel components. The slice tree is a new data
structure that compactly represents the space of all possible static p-threads. Aggregate
advantage is a formula that uses raw program statistics and computation
structure to assign each candidate static p-thread a numeric score based on
estimated latency tolerance and overhead aggregated over its expected dynamic
executions. Our framework finds the set of p-threads whose aggregate advantages
sum to a maximum. The framework is simple and intuitively parameterized to
model the salient microarchitecture features.
We apply our framework to the task of choosing p-threads that cover L2 cache
misses. Using detailed simulation, we study the effectiveness of our framework,
and pre-execution in general, under difference conditions. We measure the effect
of constraining p-thread length, of adding localized optimization to p-threads,
and of using various program samples as a statistical basis for the p-thread
selection, and show that our framework responds to these changes in an intuitive
way. In the microarchitecture dimension, we measure the effect of varying memory
latency and processor width and observe that our framework adapts well to these
changes. Each experiment includes a validation component which checks
that the formal model presented to our framework correctly represents actual
execution,
MS-CIS-02-24
DISE: Dynamic Instruction Stream Editing
E Christopher Lewis, Amir Roth, Marc Corliss
MS-CIS-02-25
Byron Choi, Mary Fernandez and Jerome Simeon
XQuery is a strongly typed, functional language, which supports the common
processing, transmation, and querying tasks of a wide variety of XML applications.
Following the tradition of other functional languages. XQuery includes a complete
formal semantics. In this paper, we argue that basing an XQuery implementation
on the XQuery Formal Semantics not only ensures correctness, but is a good
foundation for optimization. We describe an architecture that we have implemented
and that is based on the XQuery Formal Semantics and describe several logical
and physical optimizations that can be easily integrated in the above architecture.
MS-CIS-02-26
Libin Shen and Jinying Chen
The Template Relation (TR) task is an information extraction problem introduced
in the 7th Message Understanding Conference (MUC-7). In this paper, we have
proposed an approach to converting the problem into a discriminative one. We
obtain F-Measure of 78% on sentence-level relation which is comparable to the
best systems presented on MUC-7, while almost no extra annotation work is required.
In our approach, we first use Supertagger and Lightweight Dependency Analyzer
to do shallow parsing on the text. Then features are abstracted from the shallow
parsing result. We use these features as weak hypotheses, and combine them
into the final one with the boosting algorithm.
MS-CIS-02-27
Dynamic Message Sequence Charts
M. Leucker, P. Madhusudan, S. Mukhopadhyay
MS-CIS-02-28
On-line payment method for small amount transactions using hierarchical prepaid
cards
Toru Egashira and Jonathan M. Smith
MS-CIS-02-29
The Influence of ATM on Operating Systems
Jonathan M. Smith
The features of ATM offered many attractions to the application community,
such as fine-grained multiplexing and high-throughput links. These created
considerable challenges for the O.S. designer, since a small protocol data
unit size (the 48 byte "cell") and link bandwidths within a (binary) order
of magnitude of memory bandwidths demanded considerable rethinking of operating
system structure.
Using an historical and personal perspective, this paper describes two aspects
of that rethinking which I participated in directly, namely, those of new event
signalling and memory buffering schemes. Ideas an dtechniques stemming from
ATM network research influenced first research operating systems and then commercial
operating systems. The positive results of ATM networking, although indirect,
have benefitted applications an dsystems far beyond the original design goals.
MS-CIS-02-30
Reflections on Active Networking
Jonathan M. Smith
Interactions among telecommunications networks, computers, and other peripheral
devices have been of interest since the earliest distributed computing systems.
A key architectural question is the location (and nature) of programmability.
One perspective, that examined in this paper, is that network elements should
be as programmable as possible, in order to build the most flexible distributed
computing systems.
This paper presents my personal view of the history of programmable networking
over the last two decades, and in the spirit of "vox audita perit, littera
scripta manet", includes an account of how what is now called "Active Networking" came
into being. It demonstrates the deep roots Active Networking has in the programming
languages, networking and operating systems communities, and shows how interdisciplinary
approaches can have impacts greater than the sums of their parts. Lessons are
drawn both from the broader research agenda, and the specific goals pursued
in the SwitchWare project. I close by speculating on possible futures for Active
Networking.
MS-CIS-02-31
A Hybrid Approach to Estimating per-link Network Queuing Delays
Kostas G. Anagnostakis and Michael B. Greenwald
The network tomography problem requires a remote source to estimate network-internal
statistics on individual links within the network. Several techniques have been
proposed to perform network delay tomography. The most accurate prior technique
directly measured delay by using ICMP Timestamp requests to the head and tail
of the measured link. However, routing irregularities in the Internet significantly
limited the applicability to individual links (the route to the head of the link
had to be a proper prefix of the route to the tail of the link). Alternative
techniques with broader coverage are all either less accurate, require new functionality
in routers, or require the existence of a network-wide measurement infrastructure.
This report describes our attempt to develop a new approach to network delay
tomography. We combine direct measurements of multi-hop segments with indirect
inference methods to estimate delay distributions on individual links. Our approach
is to use a number of probes to directly measure overlapping multi-link segments
in the neighborhood of the target link. The delay distributions on these segments
are used as input to an inference algorithm that derives the queuing delay distribution
on the target link. The method can measure one-way delays, depends only on existing
infrastructure, and is almost universally applicable. The coverage of this technique
is assessed by inspecting many thousands of Internet paths. We report on coverage
as a function of resolution: the finer the resolution requested, the lower the
coverage possible (with measurements of individual hops having least coverage).
Coarse resolution (multi-hop segments) estimates can be produced for almost every
link in the Internet.
Choices of several different computational methods for indirect inference remain.
We are still exploring the relative accuracy of these methods through simulation
(where it is possible to determine with absolute accuracy the real queueing delays).
We present our initial assessments here.
MS-CIS-02-32
Ameesh Makadia, Kostas Daniilidis
Omnidirectional images arising from 3D-motion of a camera contain persistent
structures over a large variation of motions because of their large field of
view. This persistence made appearance-based methods attractive for robot localization
given reference views. Assuming that central omnidirectional images can be mapped
to the sphere, the question is what are the underlying mappings of the sphere
that can reflect a rotational camera motion. Given such a mapping, we propose
a systematic way for finding invariance and the mapping parameters themselves
based on the generalization of the Fourier transform. Using results from representation
theory, we can generalize the Fourier transform to any homogeneous space with
a transitively acting group. Such a case is the sphere with rotation as the acting
group. The spherical harmonics of an image pair are related to each other through
a shift theorem involving the irreducible representation of the rotation group.
We show how to extract Euler angles using this theorem. We study the effect of
the number of spherical harmonic coefficients as well as the effect of violation
of appearance persistence
in real imagery.
MS-CIS-02-33
Heuristics for the Braid Conjugacy Witness Problem: extended abstract
Geoffrey B Milord and Max Kanovich
This extended abstract summarizes a heuristic for the braid conjugacy witness
problem.
MS-CIS-02-34
Rajeev Alur, Thao Dang, and Franjo Ivancic
Predicate abstraction has emerged to be a powerful technique for extracting finite-state
models from infinite-state discrete programs. This report presents algorithms
and tools for reachability analysis of hybrid systems by combining the notion
of counter-example guided predicate abstraction with recent techniques for approximating
the set of reachable states of linear systems using polyhedra. Given a hybrid
system and a set of predicates, we consider the finite discrete quotient whose
states correspond to all possible truth assignments to the input predicates.
The tool performs an on-the-fly exploration of the abstract system. The success
of this scheme crucially depends on the choice of the predicates used for abstraction.
We identify such predicates automatically by analyzing spurious counter-examples
generated by the search in the abstract state space. We present the basic techniques
for guided search in the abstract state space, discovering new predicates that
will rule out closely related spurious counter-examples, optimizations of these
techniques, implementation of these in our verifier, and case studies demonstrating
the promise of the approach.
MS-CIS-02-35
Shallow Parsing with Conditional Random Fields
By Fei Sha and Fernando Pereira
Conditional random fields for sequence labeling offer advantages over both generative
models like HMMs and classifiers applied at each sequence position. Among sequence
labeling tasks in natural language, shallow parsing has received much attention,
with the development of standard evaluation datasets and extensive comparison
among methods. We show here how to train a conditional random field to achieve
performance as good as any reported base noun-phrase chunking method on the CoNLL
task, and better than any reported single model. Improved optimization algorithms
in training were critical to achieving these results. We present extensive comparisons
among training methods that confirm and refine other published results for training
maximum-entropy-style models.
MS-CIS-02-36
Playing Games with Boxes and Diamonds
Rajeev Alur, Salvatore La Torre, P. Madhusudan
|
 |