CRF Project Page
The CRF package is a Java implementation of Conditional Random Fields for
sequential labeling, developed by Sunita Sarawagi of IIT Bombay. The package
is distributed in the hope that it will be useful to researchers working in
information extraction or related areas.
We have attempted to keep the core CRF package compact and barebones
for ease of deployment. However, we have packaged additional
supporting classes for generating features, managing model structure,
and maintaining a dictionary of the words in the training data. The best
way to learn how to use this code is to follow the examples in the package
for Sequence annotations and for Maximum entropy classification. Another
example of deploying the package can be seen in William Cohen's Minorthird
information extraction toolkit.
We believe this is an efficient implementation of CRFs because it
relies extensively on sparse matrix operations and Quasi-Newton
optimization during training. Care is taken to avoid memory
allocations within core training loops. The support for sparse matrix
operations is taken from the COLT distribution, and the Quasi-Newton
optimization algorithm (LBFGS) is taken from riso.numerical.
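To illustrate the two efficiency ideas mentioned above (sparse transition structure and no allocation inside the inner loop), here is a minimal sketch of the forward recursion that computes the log partition function of a linear-chain CRF. It is not code from this package and does not use the COLT or riso.numerical APIs; the class name `SparseForward`, the method `logPartition`, and the edge-list layout (`from`, `to`, `edgeScore`) are assumptions made up for this example.

```java
import java.util.Arrays;

/** Illustrative sketch only; names and data layout are hypothetical, not the package's API. */
public class SparseForward {

    /** Numerically stable log(exp(a) + exp(b)). */
    static double logSumExp(double a, double b) {
        if (a == Double.NEGATIVE_INFINITY) return b;
        if (b == Double.NEGATIVE_INFINITY) return a;
        double m = Math.max(a, b);
        return m + Math.log(Math.exp(a - m) + Math.exp(b - m));
    }

    /**
     * Log partition function of a linear-chain CRF.
     * nodeScore[t][y] is the log-score of label y at position t.
     * Only the transitions listed in (from[e] -> to[e]) are allowed, each with
     * log-score edgeScore[e]; the same edge scores are used at every position.
     */
    static double logPartition(double[][] nodeScore, int[] from, int[] to, double[] edgeScore) {
        int n = nodeScore.length;
        int numLabels = nodeScore[0].length;

        // Both buffers are allocated once, outside the loop over positions.
        double[] alpha = new double[numLabels];
        double[] next = new double[numLabels];
        System.arraycopy(nodeScore[0], 0, alpha, 0, numLabels);

        for (int t = 1; t < n; t++) {
            Arrays.fill(next, Double.NEGATIVE_INFINITY);
            // Visit only the non-zero transitions instead of all label pairs.
            for (int e = 0; e < from.length; e++) {
                next[to[e]] = logSumExp(next[to[e]], alpha[from[e]] + edgeScore[e]);
            }
            for (int y = 0; y < numLabels; y++) {
                next[y] += nodeScore[t][y];
            }
            // Swap the buffers rather than allocating a new array.
            double[] tmp = alpha;
            alpha = next;
            next = tmp;
        }

        double logZ = Double.NEGATIVE_INFINITY;
        for (int y = 0; y < numLabels; y++) {
            logZ = logSumExp(logZ, alpha[y]);
        }
        return logZ;
    }

    public static void main(String[] args) {
        // Three positions, two labels, and a sparse transition set without 1 -> 0.
        double[][] nodeScore = {{0.5, 0.0}, {0.0, 1.0}, {0.2, 0.3}};
        int[] from = {0, 0, 1};
        int[] to = {0, 1, 1};
        double[] edgeScore = {0.1, -0.2, 0.4};
        System.out.println("log Z = " + logPartition(nodeScore, from, to, edgeScore));
    }
}
```

With a fully connected label set this costs the same as a dense recursion; the saving appears when the model structure forbids most label pairs, so the inner loop runs over the listed edges only.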
Useful Links
- The original CRF paper:
John Lafferty, Andrew McCallum, and Fernando Pereira.
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling
Sequence Data. In Proceedings of the International Conference on
Machine Learning (ICML-2001), 2001.
- The code follows the notation and algorithms described in this paper:
F. Sha and F. Pereira. Shallow parsing with conditional random fields.
In Proceedings of HLT-NAACL, 2003.
- The code also includes support for the semi-CRF learner as described in these papers:
Sunita Sarawagi and William W. Cohen. Semi-Markov conditional random fields
for information extraction. In NIPS, 2004.
Sunita Sarawagi. Efficient inference on sequence segmentation models.
In Proceedings of the 23rd International Conference on Machine Learning
(ICML), Pittsburgh, PA, USA, 2006.
To learn more
Download the code. It is recommended that you check out the latest version using CVS rather than depend on the
release, which could be out of date.
View the documentation
Browse the API documentation (Javadoc)
Go over the FAQ
Send questions to crf-users[at]lists.sourceforge.net, or browse the list archive.
Contributors
The following are names of current and past contributors to the project.
- Amit Jaiswal
- Sandesh Tawari
- Imran Mansuri
- Kaushal Mittal
- Charu Tiwari
Thanks to Sourceforge for hosting this.