Now that you know the overall shape of how to train and use machine learning models, it's worth asking:
More data → better performance.
One problem:
Example: how-to-make-a-racist-ai-without-really-trying.ipynb
ConceptNet Numberbatch 17.04: better, less-stereotyped word vectors
Semantics derived automatically from language corpora contain human-like biases
Image Representations Learned With Unsupervised Pre-Training Contain Human-like Biases
racist data destruction? a Boston housing dataset controversy