Tomáš Souček

Postdoctoral Researcher at Meta FAIR

6 Rue Ménars, Paris, France
soucek (at) meta.com

github     linkedin     scholar

I am a researcher at Meta FAIR (Fundamental AI Research) in Paris, working on watermarking and adversarial AI.

Previously, I did my PhD at IMPACT (CTU/CIIRC) under the supervision of Josef Šivic (IMPACT), Ivan Laptev (Inria), and Dima Damen (University of Bristol, DeepMind), focusing on weakly supervised learning from videos and diffusion models. I also led a team of researchers, engineers, and robotics experts responsible for developing a universal bin-picking solution at SANEZOO.

Before the COVID times, I was a Senior Researcher at Avast AI lab, working with large transformer neural networks. I received a Master's degree in Artificial Intelligence from Charles University. I also spent three months as a Visiting Researcher at the University of Bristol.

Selected Publications

T. Souček, P. Gatti, M. Wray, I. Laptev, D. Damen, and J. Šivic
ShowHowTo: Generating Scene-Conditioned Step-by-Step Visual Instructions
CVPR'25
[website] [paper] [code]
G. Ponimatkin, M. Cífka, T. Souček, M. Fourmy, Y. Labbé, V. Petrík, and J. Šivic
6D Object Pose Tracking in Internet Videos for Robotic Manipulation
ICLR'25
[website] [paper] [code]
T. Souček, D. Damen, M. Wray, I. Laptev, and J. Šivic
GenHowTo: Learning to Generate Actions and State Transformations from Instructional Videos
CVPR'24
[website] [paper] [code]
T. Souček, JB. Alayrac, A. Miech, I. Laptev, and J. Šivic
Multi-Task Learning of Object States and State-Modifying Actions from Web Videos
TPAMI'24
[website] [paper] [code]
T. Souček, JB. Alayrac, A. Miech, I. Laptev, and J. Šivic
Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos
CVPR'22
[website] [paper] [code] [dataset]
T. Souček and J. Lokoč
TransNet V2: An effective deep network architecture for fast shot transition detection
arXiv'20
[paper] [code]
J. Lokoč, G. Kovalčík, T. Souček, J. Moravec, and P. Čech
A Framework for Effective Known-item Search in Video
ACM MM'19
[paper]

See the list of all publications on Google Scholar.

Other Selected Projects

GRASP robot bin picking product
Project and research lead at SANEZOO
Random bin picking system using an industrial 6-DOF robot arm.
6D object pose detection from stereo images with sub-millimeter accuracy
Lead researcher at SANEZOO
Fast incident detection by detecting anomalies in user requests
Project leader at Avast
Modeling user requests over time as a probability distribution over distributions in the high-dimensional space of Transformer-based neural networks.
Autoregressive action-conditioned 3D human motion synthesis using latent discrete codes
Thesis of Jan Waltl, co-advised with J. Šivic
[text]
Measuring advertisement reach using smart glasses
Lead investigator for a joint project of Charles University and MEDIAN (media and public opinion research agency)
Ad detection and tracking in videos from smart glasses for automatically measuring ad reach.

Miscellaneous

Reviewer for CVPR, ICCV, ECCV, SIGGRAPH, and others.

Currently supervising two master's students at Masaryk University.

Teaching Assistant for Deep Learning and Intro to Algorithms.