Deblina Bhattacharjee, Ph.D.

ARTIFICIAL INTELLIGENCE RESEARCHER
COMPUTER VISION ENTHUSIAST

About me

I completed my Ph.D. at the Image and Visual Representation Lab at EPFL, Lausanne, Switzerland. I am grateful to have been mentored by Prof. Sabine Süsstrunk, Dr. Mathieu Salzmann, Prof. Pascal Fua, and Prof. Anand Paul. I love working on interdisciplinary subjects, currently applying deep learning to reconfigure 19th- to 20th-century comics for digital media. I have also worked with satellites and with the evolutionary intelligence mechanisms of biological plants (during my Master's).

I am passionate about learning, teaching and sharing my ideas about artificial intelligence and machine learning.

I volunteer for Teach for India and EduCare (a non-profit initiative for teaching children and empowering girls). I also actively support and volunteer as a mentor at the India STEM Foundation, where I host workshops and mentor teams. Through these efforts, I aim to bring STEM to communities inclusive of all representations: gender, sexual orientation, nationality, race, etc.

News

  • Organiser of Women in Computer Vision Workshop (WiCV) at CVPR 2024, Seattle.

    Website

  • Accepted work to ICCV 2023, Paris – Vision Transformer Adapters for Generalizable Multitask Learning.

    Project page

  • Accepted work to CVPR 2023 proceedings, Vancouver – Dense Multitask Learning to Reconfigure Comics.

  • Reviewer for CVPR 2024 and 2023, ECCV 2024, ACCV 2024, ICCV 2023, WACV 2023 and 2024.

  • Successfully defended my Ph.D. thesis in Computer Science at EPFL. Jury comprised Prof. Andrew Zisserman (Oxford, Deepmind), Prof. Sara Beery (MIT, Google), Prof. Wenzel Jakob (EPFL), Prof. Pascal Fua (EPFL), Prof. Ed Bugnion (EPFL), and Prof. Mathieu Salzmann (EPFL).

  • Awarded the Teaching Assistant Award 2022 for teaching excellence in Computational Photography by EPFL.

  • Accepted work to Transactions on Machine Learning Research (TMLR 2022) – Modeling Object Dissimilarity for Deep Saliency Prediction.

    Project page

  • Accepted work to CVPR 2022, New Orleans – MulT: An End-to-End Multitask Learning Transformer.

    Project page

  • Accepted work to WACV 2022, Hawaii – Estimating Image Depth in the Comics Domain.

    Project page

  • Accepted work to Signal Processing Letters 2021 – Fidelity Estimation Improves Noisy-Image Classification with Pretrained Networks.

    Full code

  • I was honoured to take part in the podcast Deep Learning for Machine Vision, which has 15K+ views.

    Listen here

  • I was honoured to take part in the podcast Solving an Optimization Problem with a Custom Built Algorithm, which has 10K+ views.

    Listen here

Publications

Vision Transformer Adapters for Generalizable Multitask Learning

Deblina Bhattacharjee, Sabine Süsstrunk, Mathieu Salzmann
ICCV, 2023, Paris, France
We introduce the first multitasking vision transformer adapters that learn generalizable task affinities which can be applied to novel tasks and domains. Integrated into an off-the-shelf vision transformer backbone, our adapters can simultaneously solve multiple dense vision tasks in a parameter-efficient manner, unlike existing multitasking transformers that are parametrically expensive. In contrast to concurrent methods, we do not require retraining or fine-tuning whenever a new task or domain is added. We introduce a task-adapted attention mechanism within our adapter framework that combines gradient-based task similarities with attention-based ones. The learned task affinities generalize to the following settings: zero-shot task transfer, unsupervised domain adaptation, and generalization without fine-tuning to novel domains. We demonstrate that our approach outperforms not only the existing convolutional neural network-based multitasking methods but also the vision transformer-based ones.

Dense Multitask Learning To Reconfigure Comics

Deblina Bhattacharjee, Sabine Süsstrunk, Mathieu Salzmann
CVPR (Workshop proceedings), 2023, Vancouver, Canada
In this paper, we develop a MultiTask Learning (MTL) model to achieve dense predictions for comics panels to, in turn, facilitate the transfer of comics from one publication channel to another by assisting authors in the task of reconfiguring their narratives. Our MTL method can successfully identify the semantic units as well as the embedded notion of 3D in comics panels. This is a significantly challenging problem because comics comprise disparate artistic styles, illustrations, layouts, and object scales that depend on the author's creative process. Typically, dense image-based prediction techniques require a large corpus of data. Finding an automated solution for dense prediction in the comics domain therefore becomes more difficult given the lack of ground-truth dense annotations for comics images. To address these challenges, we develop the following solutions: (i) we leverage a commonly used strategy known as unsupervised image-to-image translation, which allows us to utilize a large corpus of real-world annotations; (ii) we utilize the results of the translations to develop our multitasking approach, which is based on a vision transformer backbone and a domain-transferable attention module; (iii) we study the feasibility of integrating our MTL dense-prediction method with an existing retargeting method, thereby reconfiguring comics.

MulT: An End-to-End Multitask Learning Transformer

Deblina Bhattacharjee, Tong Zhang, Sabine Süsstrunk, Mathieu Salzmann
CVPR, 2022, New Orleans, USA
We propose an end-to-end Multitask Learning Transformer framework, named MulT, to simultaneously learn multiple high-level vision tasks, including depth estimation, semantic segmentation, reshading, surface normal estimation, 2D keypoint detection, and edge detection. Based on the Swin transformer model, our framework encodes the input image into a shared representation and makes predictions for each vision task using task-specific transformer-based decoder heads. At the heart of our approach is a shared attention mechanism modeling the dependencies across the tasks. We evaluate our model on several multitask benchmarks, showing that our MulT framework outperforms both the state-of-the-art multitask convolutional neural network models and all the respective single-task transformer models. Our experiments further highlight the benefits of sharing attention across all the tasks, and demonstrate that our MulT model is robust and generalizes well to new domains.

Modeling Object Dissimilarity for Deep Saliency Prediction

Bahar Aydemir*, Deblina Bhattacharjee*, Seungryong Kim, Tong Zhang, Mathieu Salzmann, Sabine Süsstrunk (* equal contribution)
Transactions on Machine Learning Research (TMLR 2022)
Saliency prediction has made great strides over the past two decades, with current techniques modeling low-level information, such as color, intensity and size contrasts, and high-level information, such as attention and gaze direction for entire objects. Despite this, these methods fail to account for the dissimilarity between objects, which humans naturally do. In this paper, we introduce a detection-guided saliency prediction network that explicitly models the differences between multiple objects, such as their appearance and size dissimilarities. Our approach is general, allowing us to fuse our object dissimilarities with features extracted by any deep saliency prediction network. As evidenced by our experiments, this consistently boosts the accuracy of the baseline networks, enabling us to outperform the state-of-the-art models on three saliency benchmarks, namely SALICON, MIT300 and CAT2000.

Estimating Image Depth in the Comics Domain

Deblina Bhattacharjee, Martin Everaert, Mathieu Salzmann, Sabine Süsstrunk
WACV, 2022, Hawaii, USA
Estimating the depth of comics images is challenging as such images a) are monocular; b) lack ground-truth depth annotations; c) differ across artistic styles; and d) are sparse and noisy. We thus use an off-the-shelf unsupervised image-to-image translation method to translate the comics images to natural ones, and then use an attention-guided monocular depth estimator to predict their depth. This lets us leverage the depth annotations of existing natural images to train the depth estimator. Furthermore, our model learns to distinguish between text and images in the comics panels to reduce text-based artefacts in the depth estimates. Our method consistently outperforms the existing state-of-the-art approaches across all metrics on both the DCM and eBDtheque images. Finally, we introduce a dataset to evaluate depth prediction on comics.

Fidelity Estimation Improves Noisy-Image Classification With Pretrained Networks

Xiaoyu Lin, Deblina Bhattacharjee, Majed El Helou, Sabine Süsstrunk
IEEE Signal Processing Letters, 2021.
Image classification has significantly improved using deep learning. This is mainly due to convolutional neural networks (CNNs) that are capable of learning rich feature extractors from large datasets. However, most deep learning classification methods are trained on clean images and are not robust when handling noisy ones, even if a restoration preprocessing step is applied. While novel methods address this problem, they rely on modified feature extractors and thus necessitate retraining. We instead propose a method that can be applied to a pretrained classifier. Our method exploits a fidelity map estimate that is fused into the internal representations of the feature extractor, thereby guiding the attention of the network and making it more robust to noisy data. We improve the noisy-image classification (NIC) results by significant margins, especially at high noise levels, and come close to the fully retrained approaches. Furthermore, as proof of concept, we show that when using our oracle fidelity map we even outperform the fully retrained methods, whether trained on noisy or restored images.

DUNIT: Detection-Based Unsupervised Image-to-Image Translation

Deblina Bhattacharjee, Seungryong Kim, Guillaume Vizier, Mathieu Salzmann
CVPR, 2020, Seattle, USA
Image-to-image translation has made great strides in recent years, with current techniques being able to handle unpaired training images and to account for the multi-modality of the translation problem. Despite this, most methods treat the image as a whole, which makes the results they produce for content-rich scenes less realistic. In this paper, we introduce a Detection-based Unsupervised Image-to-image Translation (DUNIT) approach that explicitly accounts for the object instances in the translation process. To this end, we extract separate representations for the global image and for the instances, which we then fuse into a common representation from which we generate the translated image. This allows us to preserve the detailed content of object instances, while still modeling the fact that we aim to produce an image of a single consistent scene. We introduce an instance consistency loss to maintain the coherence between the detections. Furthermore, by incorporating a detector into our architecture, we can still exploit object instances at test time. As evidenced by our experiments, this allows us to outperform the state-of-the-art unsupervised image-to-image translation methods. Furthermore, our approach can also be used as an unsupervised domain adaptation strategy for object detection, and it also achieves state-of-the-art performance on this task.

Image Analysis using a novel learning algorithm based on Plant Intelligence

Deblina Bhattacharjee
NeurIPS / WiML 2017 workshop, Long Beach, California, USA

An Immersive Learning Model Using Evolutionary Learning

Deblina Bhattacharjee, Anand Paul, J.H. Kim and P. Karthigaikumar
Elsevier Computers and Electrical Engineering (CAEE) Journal (2017)
In this article, we propose an educational model using virtual reality on a mobile platform that personalizes the simulated environments according to user actions. We also propose an evolutionary learning algorithm based on which the user's learning path is designed and the corresponding simulated learning environment is modified. The main objective of this study is to create a personalized learning path for each student according to their calibre, and to make the learning immersive and retainable using virtual reality. Our proposed model emulates the innate natural learning process in humans and uses it to customize the virtual simulations of the lessons by applying the evolutionary learning technique. A quasi-experimental study is conducted over different case studies to establish the effectiveness of our learning model. The results show that our learning model is immersive and yields long-term retention while enhancing creativity through reinforced customization of the simulations.

A Leukocyte Detection technique in Blood Smear Images using Plant Growth Simulation Algorithm

Deblina Bhattacharjee and Anand Paul
AAAI 2017: 31st AAAI Conference on Artificial Intelligence, San Francisco, USA, February 3–9, 2017
For quite some time, the analysis of leukocyte images has drawn significant attention from the fields of medicine and computer vision alike, where various techniques have been used to automate the manual analysis and classification of such images. Analysing such samples manually to detect leukocytes is time-consuming and prone to error, as the cells have different morphological features. Therefore, to automate and optimize the process, we apply the nature-inspired Plant Growth Simulation Algorithm (PGSA) in this paper. We present an automated technique for detecting white blood cells embedded in obscured, stained and smeared images of blood samples, based on a random bionic algorithm that uses a fitness function measuring the similarity of each generated candidate solution to an actual leukocyte. As the proposed algorithm proceeds, the set of candidate solutions evolves, guaranteeing their fit with the actual leukocytes outlined in the edge map of the image. The experimental results on the stained images and the empirical results reported validate the higher precision and sensitivity of the proposed method compared to existing methods. Further, the proposed method reduces the feasible set of candidate points in each iteration, thereby decreasing the required run time of load flow and objective function evaluation, thus reaching the goal state in minimum time and within the desired constraints.

A Hybrid Search Optimization Technique Based on Evolutionary Learning in Plants

Deblina Bhattacharjee and Anand Paul
Springer LNCS, Proceedings of the Seventh International Conference on Swarm Intelligence (ICSI), Bali, Indonesia, June 24–30, 2016
In this article, we propose a search optimization algorithm based on the natural intelligence of biological plants, modelled using a three-tier architecture comprising the Plant Growth Simulation Algorithm (PGSA), Evolutionary Learning, and Reinforcement Learning in each tier, respectively. The method combines the heuristic-based PGSA with Evolutionary Learning and an underlying Reinforcement Learning technique, where natural selection is used as feedback. This enables us to achieve a highly optimized search algorithm that simulates the evolutionary techniques found in nature. The proposed method reduces the feasible set of growth points in each iteration, thereby reducing the required run times of load flow and objective function evaluation, thus reaching the goal state in minimum time and within the desired constraints.

An object localization optimization technique in medical images using plant growth simulation algorithm

Deblina Bhattacharjee, Anand Paul, J. H. Kim and M. Kim
SpringerPlus Journal (2016), Volume 5, Article 1784, pages 1–20
The analysis of leukocyte images has drawn interest from the fields of both medicine and computer vision for quite some time, where different techniques have been applied to automate the process of manual analysis and classification of such images. Manual analysis of blood samples to identify leukocytes is time-consuming and susceptible to error due to the different morphological features of the cells. In this article, the nature-inspired Plant Growth Simulation Algorithm is applied to optimize the image processing technique of object localization in medical images of leukocytes. This paper presents a random bionic algorithm for the automated detection of white blood cells embedded in cluttered smear and stained images of blood samples, which uses a fitness function measuring the resemblance of the generated candidate solution to an actual leukocyte. The set of candidate solutions evolves via successive iterations as the proposed algorithm proceeds, guaranteeing their fit with the actual leukocytes outlined in the edge map of the image. The higher precision and sensitivity of the proposed scheme compared to existing methods is validated with the experimental results on blood cell images. The proposed method reduces the feasible set of growth points in each iteration, thereby reducing the required run time of load flow and objective function evaluation, thus reaching the goal state in minimum time and within the desired constraints.

Autonomous Terrestrial Image Segmentation and Sensor Node Localization for Disaster Management using Plant Growth Simulation Algorithm

Deblina Bhattacharjee, Anand Paul, WH Hong, HC Seo, S Karthik
preprints.org
The use of unmanned aerial vehicles (UAVs) during the emergency response to a disaster has become widespread in recent years, and the terrain images captured by the cameras on board these vehicles are significant sources of information for such disaster monitoring operations. Thus, analyzing such images is important for assessing the terrain of interest during such emergency response operations. Further, these UAVs are mainly used in disaster monitoring systems for the automated deployment of sensor nodes in real time. Therefore, optimally deploying and localizing the wireless sensor nodes, only in the regions of interest identified by segmenting the images captured by UAVs, holds paramount significance and directly affects their performance. In this paper, the highly effective nature-inspired Plant Growth Simulation Algorithm (PGSA) is applied to the segmentation of such terrestrial images and to the localization of the deployed sensor nodes. The problem is formulated as a multi-dimensional optimization problem, and PGSA is used to solve it. Furthermore, the proposed method has been compared to other existing evolutionary methods, and simulation results show that PGSA achieves better performance in both speed and accuracy than other techniques in the literature.

Evolutionary Reinforcement Learning based Search Optimization

Deblina Bhattacharjee
SAC 2016: Proceedings of the 31st Annual ACM Symposium on Applied Computing, Pisa, Italy, April 4–8, 2016. Publisher: ACM
Nature has always inspired researchers to find the best solutions to the toughest of problems. In this article, we propose a search optimization algorithm based on a refined Plant Growth Simulation Algorithm (PGSA) that uses reinforcement learning. The method combines the heuristic-based PGSA with reinforcement learning techniques, where natural selection is used as feedback, thus combining evolutionary algorithms with learning. This enables us to achieve a highly optimized algorithm for growth point search that simulates the evolutionary techniques seen in nature. The proposed method reduces the feasible set of growth points in each iteration, thereby reducing the required run times of load flow, objective function evaluation, and morphactin concentration calculation.

Adaptive Transcursive Algorithm for Depth Estimation in Deep Learning Networks

Uthra Kunathur Thikshaja, Anand Paul, Seungmin Rho and Deblina Bhattacharjee
2016 International Conference on Platform Technology and Service (PlatCon), Jeju, South Korea, February 15–17, 2016
Estimating depth in a Neural Network (NN) or Artificial Neural Network (ANN) is an integral yet complicated process. In this article, we propose combining function transformation with recursion to obtain an adaptive, transcursive algorithm that represents the backpropagation concept used in deep learning for a Multilayer Perceptron Network (MPN). Each function can represent a hidden layer of the neural network and can be made to handle a complex part of the processing. Whenever an undesirable output occurs, we transform (modify) the functions until a desirable output is obtained. We present an algorithm that uses this transcursive model to interpret the concept of deep learning using an MPN.

Talks

  • Poster presentation at CVPR 2022 on Multitask Learning in Computer Vision, New Orleans, USA, 2022.
  • Invited talk on the Evolution of Computer Vision in recent years: Convolutional Neural Network to Transformers and Self-supervised Learning at Synapse, Milan, 2022.
  • Oral and poster presentation at WACV 2022 on Learning Image Depth in Comics Domain, Hawaii, USA, 2022.
  • Poster presentation at CVPR 2020 on Detection based Unsupervised Image to Image Translation, Seattle, USA, 2020.
  • Invited speaker about Future Trends in Deep Learning, at International Conference on Data Science and Big Data Analytics, on May 24-25, 2018 in Toronto, Canada.
  • Poster presentation at WiML (Women in Machine Learning) workshop, Long Beach, California, USA, 2017.
  • Advances in Deep Learning, CiTE, Samsung Intelligent Media Research Centre, POSTECH, South Korea, 2017.
  • Oral presentation at AAAI about Plant Intelligence and how it can be used to optimize Machine Learning, San Francisco, USA, 2017.
  • Oral presentation at ICSI, Bali, Indonesia, 2016.
  • Oral presentation and poster presentation at ACM SAC, Italy, 2016.
  • Talk at Platcon, Jeju, South Korea, 2017.
  • Departmental Talk on Vision, Deep Learning and Optimization, Kyungpook National University 2016.
  • Oral presentation at Computer Science and Engineering department, Kyungpook National University, 2015.
  • Oral presentation at the project grant proposal meetings, Daegu, South Korea, 2015-2017.

Academic Services

Reviewer of Machine Learning, Signal Processing and IoT journals and conferences:

  • Reviewer of proceedings of IEEE/CVF Winter Conference on Applications of Computer Vision, WACV.
  • Reviewer of proceedings of IEEE Women in Machine Learning.
  • Reviewer of IEEE Transactions on Signal and Information Processing.
  • Reviewer of Elsevier Computers and Electrical Engineering.
  • Reviewer of IEEE Intelligent Transport Systems [invited].
  • Reviewer of Taylor and Francis Behaviour and Information Technology.
  • Reviewer for Springer Plus Journal.
  • Reviewer of IEEE Transactions on Emerging Topics in Computing.
  • Reviewer of Springer Cluster Computing- The Journal of Networks, Software Tools and Applications.
  • Reviewer of proceedings of ACM SAC 2016, 2017.

Teaching and Research Supervision:

  • Supervised 2 successful Master's theses at EPFL on image translation and dense multitask learning. Both past students now work at Google.
  • Supervised 5 research projects leading to 3 publications.
  • Head teaching assistant for Computational Photography CS-413 at EPFL.

Member of organizations:

  • Member of Association for the Advancement of Artificial Intelligence (AAAI).
  • Member of International Machine Learning Society (IMLS).
  • Member of Computer Society of India.
  • Member of Association for Computing Machinery.

AWARDS

  • Swiss National Science Foundation Sinergia Grant, Switzerland 2019-2023.
  • Women in Machine Learning (WIML) Grant, 2017.
  • ACM SAC SRC best student paper nomination (top 5 globally), 2016.
  • ACM SIGAPP Travel Award, Italy, 2016.
  • KNU International Student Ambassador 2017- present.
  • Brain Korea 21 Plus grant for research, Kyungpook National University, awarded to the top 1% of applicants in the Department of Computer Science Engineering, 2015-2017.
  • Awarded full merit scholarship by Kyungpook National University, 2015-2017 (4 Semesters).
  • Christ University Merit Scholarship for all 8 semesters, 2011-2015. Dean's List.
  • MS Artificial Intelligence, Offer of Study, from New York University, 2015.
  • MBA Business Analytics, Offer of Study, from University of Tampa, Florida, USA, 2015.
  • BS Computer Science Engineering (transfer), Offer of Study, from University of Rochester, USA 2013.
  • Best Overall Performer of the Year 2012 of all undergraduate and postgraduate students, Christ University.
  • Runner-up International Science Debate Competition by Quanta, November 2009.
  • Won Gold medal and ranked 1st in National Cyber Olympiad in India, 2005.
  • Awarded Distinction in Macmillan International Assessment, University of New South Wales, Australia, 2004-2006.

Outreach

  • Sponsored and mentored a team of middle-school children for the FIRST LEGO League in India. Click to know more.

  • Volunteered as a mentor at the Robo Siksha Kendra (the non-profit robotics school of the India STEM Foundation), an event encouraging children, especially girls, in STEM and AI. Click to know more.

  • Organized a preparation workshop on computer science for an international robotics competition at the India STEM Foundation, attended by 350 children aged 6-12. Mentored an all-girls team for the International Robotics Competition.

Further Work

  • 3D reconstruction of objects in comics domain: A work done as part of the Swiss National Science Foundation Sinergia project. Video link