So, I'm doing an assignment where I have to take a text file, count the total number of words and the number of distinct words, and then print out the most used words and their frequencies. It works like a charm, but on large files (which we need to handle), my method of searching through the list of words seen so far to check whether each word is distinct takes ****ing forever. Can anyone help me make this more efficient?
Here's the code (pastebin so it keeps formatting):
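For reference, and separate from the pastebin code: the usual cause of this kind of slowdown is scanning a list for every word, which makes the work grow roughly with the square of the file size. A hash-based lookup (a dictionary or similar) avoids that. Below is a rough sketch in Python, which may not be the language of the assignment. The names `word_stats` and `count_words`, the `top_n` parameter, and the rule that a "word" is a run of letters and apostrophes are all assumptions for illustration.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of letters/apostrophes, compared case-insensitively.
WORD_RE = re.compile(r"[a-z']+")

def count_words(lines):
    """Build a word -> frequency table from an iterable of text lines."""
    counts = Counter()
    for line in lines:
        # Counter is backed by a hash table, so each update is O(1) on average
        # instead of a scan through every word seen so far.
        counts.update(WORD_RE.findall(line.lower()))
    return counts

def word_stats(text, top_n=10):
    """Return (total words, distinct words, top_n most common words)."""
    counts = count_words(text.splitlines())
    total = sum(counts.values())
    distinct = len(counts)
    return total, distinct, counts.most_common(top_n)

if __name__ == "__main__":
    total, distinct, top = word_stats("the cat and the hat and the bat", top_n=2)
    print(total, distinct, top)
```

For a large file, passing the open file object to `count_words` (for example `with open(path) as f: counts = count_words(f)`, where `path` is whatever file you are analyzing) should process it line by line without loading the whole thing into memory.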



