I have been thinking about this myself. I'm working on some custom dictionaries built from the words I find in my corpus of movie subtitles. I'm sure it's not a remotely new idea, but it's fun, because it gives me a dictionary that contains only the words people "actually use", with "real" example sentences. (Scare quotes because movie dialogue isn't 100% as real as I'd like.)
I also like that I can see how common every form of every word is. I was surprised to learn that almost none of the most common words are nouns. And in my internal tools I can filter movies by release date to track changes over time, which is neat.
If your movie collection is big enough, that might be really useful for language learning: you could build your own frequency lists and collect common phrases. I'd be curious how it stacks up against the written word.
That's exactly what I'm using it for! And my movie collection is pretty big: I have about 1,000 movies. (Many are translated, but many are not.)
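For anyone who wants to try the frequency-list idea, here is a minimal sketch. It is not the tooling described above, just one way it could be done, and it assumes the subtitles are plain SRT text. The names `dialogue_lines` and `frequencies` are made up for this example, and the tokenizer is deliberately crude (letters plus internal apostrophes, lowercased).

```python
import re
from collections import Counter

# Assumption: SRT-style input (cue number, timestamp line, dialogue, blank line).
TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->")
TAG = re.compile(r"<[^>]+>|\{[^}]+\}")      # <i>...</i> and {\an8}-style markup
WORD = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")  # letters, allowing "don't"


def dialogue_lines(srt_text):
    """Yield only the spoken lines, dropping cue numbers and timestamps."""
    for line in srt_text.splitlines():
        line = line.strip()
        # Caveat: a dialogue line made only of digits is dropped as a cue number.
        if not line or line.isdigit() or TIMESTAMP.match(line):
            continue
        yield TAG.sub("", line)


def frequencies(srt_texts):
    """Count single words and adjacent word pairs across many subtitle files."""
    words, bigrams = Counter(), Counter()
    for text in srt_texts:
        for line in dialogue_lines(text):
            tokens = [t.lower() for t in WORD.findall(line)]
            words.update(tokens)
            # Pairs are counted within one subtitle line only, not across lines.
            bigrams.update(zip(tokens, tokens[1:]))
    return words, bigrams
```

Calling `words.most_common(50)` on the result gives a rough frequency list, and `bigrams.most_common(50)` a rough list of common two-word phrases. Counting by word form rather than by lemma is a choice here; grouping forms under a headword would need a separate lemmatizer.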
I mean, all words were added to a dictionary because someone was using them. It's just that they may not be used by people in your particular region or time.
https://archive.is/Bt6vB