Using reinforcement learning to continuously improve a document treatment chain

Esther Nicart, Bruno Zanuttini, Bruno Grilhères, Patrick Giroux, Arnaud Saval

Cordon Electronics DS2i, 27000 Val de Reuil, France

Normandie Univ, UNICAEN, ENSICAEN, CNRS, GREYC, 14000 Caen, France

Airbus Defence and Space, Élancourt, France

31 December 2017
We model a document treatment chain as a Markov Decision Process and use reinforcement learning to allow an agent to learn to construct and continuously improve custom-made chains “on the fly”. We build a platform that enables us to measure the impact on learning of various models, web services, algorithms, parameters, etc. We apply this in an industrial setting, specifically to an open source document treatment chain which extracts events from massive volumes of web pages and other open-source documents. Our emphasis is on minimising the burden on the human analysts, whose feedback on the extracted events guides the agent as it learns to improve. To this end, we investigate different types of feedback, from numerical feedback, which requires a lot of tuning, to partially and even fully qualitative feedback, which is much more intuitive and demands little to no user calibration. We carry out experiments, first with numerical feedback, then demonstrate that intuitive feedback still allows the agent to learn effectively.


artificial intelligence, reinforcement learning, extraction and knowledge management, man-machine interaction, open source intelligence (OSINT)

1. Introduction
2. The WebLab platform
3. Reinforcement learning
4. Continuous improvement through reinforcement learning
5. Experimental framework
6. Measuring the quality of the results
7. Tests with numerical feedback
8. Tests with intuitive feedback
9. Conclusion and perspectives

The authors would like to thank Hugo Gilbert for fruitful discussions on qualitative feedback, as well as the anonymous reviewers of IC2015 and RIA for their constructive comments.


