Something interesting

We are what we repeatedly do. Excellence, therefore, is not an act, but a habit.


Knowledge On Demand

  • ABC
  • Search-Enabled Enterprise Knowledge
    • GSA and why it fails
    • Federated Search + P2P Search + Appliance Search
    • Search

Lucene (A necessary but insufficient part)

Other Related Enablers:

MapReduce (Google Search):


  • FMEA quick reference.
  • Jakarta POI is a pure Java library for reading and writing Excel XLS files.
  • The Jackcess project is a pure Java library for reading and writing Access DB files (but only for Access 2000). Another pure Java implementation is also available, but unfortunately it is commercial.
  • When it comes to robustness, the JDBC-ODBC bridge (blog and sample code) stands out for the task.
  • XStream.
  • Byteworx FMEA Software.

XP Practices:

Fun Reading:

Text Mining Today (from Text Mining 2003 host@Michael W. Berry and his projects)

  • Information Retrieval: Lucene – Excellent for IR, but not much text mining.
  • Algorithms / Infrastructure: Rain/bow, Weka (also @SF, @Wikipedia, and FAQ; amusingly, WEKA is also a.k.a. “What Everybody Keeps Asking”), GTP, TDMAPI – no data resources
  • NLP toolkit: NLTK – some data resources
  • Data Resources: WordNet (@Wikipedia), DMOZ – Excellent, but not enough breadth/depth.
