Summer of Code: Keli Hu

-Louis Suárez-Potts

2005-09


Google's Summer of Code program (SoC) was "designed to introduce students to the world of open source software development", and it worked. Successful participants received a cash reward, but by their own accounts the cash was the least of it. The knowledge they gained and the community they joined came first. OpenOffice.org was and is a fascinating if challenging codebase to work with, and the community made the effort fun, stimulating, and, in the words of one respondent, "the best summer of my life."

Over the next week and a half, I'll be publishing a short series of accounts by the participants. To read other accounts, visit the Articles page. Today, Keli Hu discusses the work he did during Google's Summer of Code. As in the previous interview, I asked a set of questions via email over a period of several weeks, the first about the Google SoC project and the rest follow-ups.

Tell us about yourself and how you got interested and involved in OpenOffice.org....

I'm Keli Hu, a first-year Ph.D. student at the Center for Language Information Processing, Beijing Language and Culture University (CLIP, BLCU), China.

What got you interested in OpenOffice.org?

Many Natural Language Processing (NLP) techniques can be combined with the office suite, and I think there is still a lot to do in this area. I know how important OOo is to the open-source software community, and I think joining OOo development is the best way to make use of my knowledge and contribute to the community.

Describe your project....

My project is a grammar checker for OpenOffice.org, with Thomas Lange as my mentor. This project was not on the list of proposed Summer of Code (SoC) projects for OpenOffice.org, so I was a little surprised that it was accepted. What I proposed was a standalone grammar checker with English and Chinese support which could be integrated into OOo later. Because I had no previous experience in OOo development, and OOo is such a big project, I was not sure how far I could go in two months if I dug into OOo development.

Things didn't go very well, really. At first I was distracted by some academic work, so I did not start until July 15th. After some discussion with my mentor, we agreed that we should first look into an interface, and at the same time I should set up the SDK and compile a few examples. I grew more worried later because of two things: 1. Because the grammar checker is probably going to be a UNO component like the spell checker, the programming languages I could choose were limited to C++ and Java, while I had been expecting to use Python because I am more productive in Python. Anyway, I chose C++. 2. Licensing. I can't use GPL sources, and basically I should be the copyright holder of my code; otherwise the investigation of licensing issues would probably take longer than the SoC initiative itself. [OpenOffice.org is actually LGPL. -Ed.]

Such being the case, it was almost impossible to do what I proposed in such a short period of time. I'd rather spend more time on OOo and write the grammar checker from scratch, making a simpler implementation and improving it gradually after SoC. After all, getting a grant to work on open source software is just a good starting point, but definitely not my goal.

I spent quite some time investigating existing grammar checkers and alternative approaches to implementing the different components of the grammar checker. Fortunately my mentor decided that I could pick a simple interface first, so I don't have to worry about that for now, maybe because the discussion about the interface I started on the mailing list didn't generate much interest. :-(

Setting up the SDK seemed easy at first, until I started to use libraries outside of OOo. To cut a long story short, the ABI change from GCC 3.3 to GCC 3.4/4.0 caused me a lot of trouble. After that was settled, everything went much more smoothly, and just a few days ago [as of end of August --Ed.] I completed a rather simplistic implementation. Now that the SoC is drawing to a close, I have a few more days to improve the code, but things other than that are out of my control. Anyway, I've done what I could, and I really learned a lot. Great thanks to Google and OpenOffice.org, and to my mentor Thomas Lange.

Thanks to Keli, and to Google for sponsoring the (Northern Hemisphere) Summer of Code! In the next few days, leading up to OOoCon 2005, I'll publish other interviews with student developers who had a great summer with OpenOffice.org thanks to Google's Summer of Code.


