1. Full Text Search System
Create an index based full text search system for text files. The system will have to be made up of the following programs:
Index
Usage: Index <path to text file>
This program will update an index file in the same folder as the executable with the words in the text file passed in as the argument. The index file will contain the following type of information:
Word1 is contained in file1, file2, file3
Word2 is contained in file3, file4, file7 ...
WordN is contained in file1, file3, file100
The exact format of the index file is an exercise for the candidate. The format strongly influences efficiency and performance so there are multiple options available here.
Search
Usage: Search <word1> <word2> etc.
This program will scan the index file for the word(s) passed in the command line and display the names of the file(s) that contain these words. The following is a sample usage and output:
$ Search candle in the wind dought
1. Searching for 'Candle' ...
Found in:
wax.txt
2. Searching for 'in' ...
Found in:
wax.txt
Elton.txt
3. Searching for 'the' ...
Found in:
wax.txt
Elton.txt
4. Searching for 'wind' ...
Found in:
Elton.txt
5. Searching for 'dought' ...
No matches found.
Once you have the basic functionality implemented, try to do the following bonus exercises:
- Incorporate a check in the Index program that will reject binary files.
- Grade search results based on number of occurrences so that files with the most occurences will be listed on top.
- Add support conditional searching. e.g.
candle AND wind - Add support searching for phrases. e.g.
"candle in the wind"
Note
Do not use third-party libraries that provide out-of-the-box solutions for indexing files and searching for keywords. The idea behind any code challenge is to test the candidate's logical thinking. Therefore, use standard data structures available in C# and the usual file/data formats like XML, JSON, CSV etc or any self-contained binary format.