Working with a MySQL database from Python was easier than I thought. You just import the CompatMysqldb module, connect to the desired database, and you are ready to start executing MySQL commands and fetching and processing the results without much trouble.
To learn a little bit of SQL (I'm still not really comfortable with SQL syntax) I changed the Python structures I used in the last indexer to MySQL tables. The goal was to mimic the behavior of the last indexer, not its performance. So what I did was replace the word dictionary with a SQL table called words, where the primary key is the word and the next field is a unique id for each word. In the previous indexer, the dictionary held a list of files in place of this id. Now, instead of a list for each word, there is just one table called occurs. This table has an entry for every occurrence of a word and associates the filename with the word id. So looking for a word is as easy as getting the word id and finding all entries that match that id in the occurs table. The result is all the files that contain the word.
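The layout and the lookup described above can be sketched roughly as follows. This is only a minimal sketch, not the indexer's actual code: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server, and the column names (word, id, word_id, filename) and the helper functions are my own assumptions. With a MySQL driver such as MySQLdb the flow is the same, but the placeholders are written %s instead of ?.

```python
import sqlite3


def create_schema(conn):
    # words: the word itself is the primary key, id is a unique number per word.
    conn.execute(
        "CREATE TABLE words (word TEXT PRIMARY KEY, id INTEGER UNIQUE NOT NULL)"
    )
    # occurs: one row per occurrence, linking a word id to a filename.
    conn.execute(
        "CREATE TABLE occurs (word_id INTEGER NOT NULL, filename TEXT NOT NULL)"
    )


def add_occurrence(conn, word, filename):
    # Hypothetical helper: look up the word id, creating the word if needed.
    row = conn.execute("SELECT id FROM words WHERE word = ?", (word,)).fetchone()
    if row is None:
        word_id = conn.execute(
            "SELECT COALESCE(MAX(id), 0) + 1 FROM words"
        ).fetchone()[0]
        conn.execute("INSERT INTO words (word, id) VALUES (?, ?)", (word, word_id))
    else:
        word_id = row[0]
    # Record this occurrence; repeated words in a file produce repeated rows.
    conn.execute(
        "INSERT INTO occurs (word_id, filename) VALUES (?, ?)", (word_id, filename)
    )


def files_with_word(conn, word):
    # Step 1: get the word id. Step 2: find every matching row in occurs.
    row = conn.execute("SELECT id FROM words WHERE word = ?", (word,)).fetchone()
    if row is None:
        return []
    rows = conn.execute(
        "SELECT DISTINCT filename FROM occurs WHERE word_id = ? ORDER BY filename",
        (row[0],),
    ).fetchall()
    return [r[0] for r in rows]
```

Calling files_with_word(conn, "python") on a connection prepared with create_schema and a few add_occurrence calls returns the sorted list of files that contain the word. The DISTINCT is there because occurs keeps one row per occurrence.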
This indexer is a little different from the last one. It saves every occurrence of every word, not just the distinct words in each file. For example, in the previous indexer, if the same word was found many times in the same file, only one entry was put in the file list. Now there is one entry per occurrence. I made this change on purpose, as information on each word needs to be saved.
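One thing the per-occurrence rows make possible is counting how often a word appears in each file. The sketch below illustrates that idea under the same assumptions as the tables described in this post (a words table and an occurs table). The column names, the build_demo_db helper and its sample data are hypothetical, and sqlite3 again stands in for MySQL so the example runs on its own.

```python
import sqlite3


def build_demo_db():
    # Hypothetical sample data: "python" appears twice in a.txt, once in b.txt.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE words (word TEXT PRIMARY KEY, id INTEGER UNIQUE NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE occurs (word_id INTEGER NOT NULL, filename TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO words (word, id) VALUES ('python', 1)")
    conn.executemany(
        "INSERT INTO occurs (word_id, filename) VALUES (?, ?)",
        [(1, "a.txt"), (1, "a.txt"), (1, "b.txt")],
    )
    return conn


def occurrence_counts(conn, word):
    # Join words to occurs on the word id, then count rows per file.
    rows = conn.execute(
        "SELECT o.filename, COUNT(*) "
        "FROM occurs AS o JOIN words AS w ON w.id = o.word_id "
        "WHERE w.word = ? "
        "GROUP BY o.filename ORDER BY o.filename",
        (word,),
    ).fetchall()
    # Each row is a (filename, count) pair, so it converts directly to a dict.
    return dict(rows)
```

With the previous design, where each file was listed at most once per word, this count could not be recovered.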
The next step is to design a good database layout that supports all the operations I want, like finding multi-word strings or searching files by name.
The code:
Mysql File Indexer