Biographical Information

Daniel Vasicek, Senior Data Scientist, Access Innovations, Inc.

Articles by Daniel Vasicek, Senior Data Scientist, Access Innovations, Inc.

Data Uncertainty, Model Uncertainty, and the Perils of Overfitting

Why should you be interested in artificial intelligence (AI) and machine learning? Any classification problem for which you have a good source of classified examples is a candidate for AI. Historically, optical character recognition (OCR) was a difficult problem. OCR performance has recently improved enormously, at least in part because very large collections of already classified examples are now available. Similarly, automatic translation between languages has made tremendous advances because enormous collections of translated documents can be used to train the classifier. Other tasks that lend themselves to AI and machine learning include concept identification in texts, entity extraction, assigning peer reviewers to submitted documents, sentiment analysis, quality evaluation, and priority assignment.