Posts by Collection

publications

Dynamic 3D Scene Reconstruction from Classroom Videos

Published in IEEE Signal Processing Society 58th Asilomar Conference on Signals, Systems, and Computers, 2024

The paper describes the development of a system for estimating 3D speaker geometry from raw images of collaborative classroom videos. The proposed system integrates methods for 2D and 3D pose estimation with depth estimation and camera calibration to detect and reconstruct the 3D speaker geometry of a collaborative group of students. Results on the Human3.6M dataset show that the system can estimate 3D poses reasonably well without the need to pre-train on the Human3.6M dataset. Furthermore, for classroom videos, the proposed system outperformed a baseline approach trained on the Human3.6M dataset. The proposed system is used to provide the 3D speaker geometry to a new speaker diarization system that performs well in noisy classroom environments.

SOFI: Multi-Scale Deformable Transformer for Camera Calibration with Enhanced Line Queries

Published in The 35th British Machine Vision Conference 2024, 2024

Camera calibration consists of estimating camera parameters such as the zenith vanishing point and horizon line. Estimating the camera parameters enables downstream tasks such as 3D rendering, artificial reality effects, and object insertion in an image. Transformer-based models have provided promising results; however, they lack cross-scale interaction. In this work, we introduce SOFI, a multi-Scale defOrmable transFormer for camera calibratIon with enhanced line queries. SOFI improves the line queries used in CTRL-C and MSCC by using both line content and line geometric features. Moreover, SOFI’s line queries allow transformer models to adopt the multi-scale deformable attention mechanism to promote cross-scale interaction between the feature maps produced by the backbone. SOFI outperforms existing methods on the Google Street View, Horizon Line in the Wild, and Holicity datasets while keeping a competitive inference speed. Code is available at: https://github.com/SebastianJanampa/SOFI

Download Paper

DT-LSD: Deformable Transformer-based Line Segment Detection

Published in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025, 2025

Line segment detection is a fundamental low-level task in computer vision, and improvements in this task can impact more advanced methods that depend on it. Most new methods developed for line segment detection are based on Convolutional Neural Networks (CNNs). Our paper seeks to address challenges that prevent the wider adoption of transformer-based methods for line segment detection. More specifically, we introduce a new model called Deformable Transformer-based Line Segment Detection (DT-LSD) that supports cross-scale interactions, can be trained quickly, and addresses the drawbacks of LETR, its transformer-based predecessor. For faster training, we introduce Line Contrastive DeNoising (LCDN), a technique that stabilizes the one-to-one matching process and speeds up training by 34×. We show that DT-LSD is faster and more accurate than LETR and outperforms all CNN-based models in terms of accuracy. On the Wireframe dataset, DT-LSD achieves 71.7 for sAP10 and 73.9 for sAP15; on the YorkUrban dataset, it achieves 33.2 for sAP10 and 35.1 for sAP15. The code is available at https://github.com/SebastianJanampa/DT-LSD

Download Paper

DETRPose: Real-time end-to-end transformer model for multi-person pose estimation

Under review, 2025

Multi-person pose estimation (MPPE) estimates keypoints for all individuals present in an image. MPPE is a fundamental task for several applications in computer vision and virtual reality. Unfortunately, there are currently no transformer-based models that can perform MPPE in real time. This paper presents a family of transformer-based models capable of performing multi-person 2D pose estimation in real time. Our approach utilizes a modified decoder architecture and keypoint similarity metrics to generate both positive and negative queries, thereby enhancing the quality of the selected queries within the architecture. Compared to state-of-the-art models, our proposed models train much faster, using 5 to 10 times fewer epochs, and achieve competitive inference times without requiring quantization libraries to speed up the model. Furthermore, our proposed models match or outperform alternative models, often using significantly fewer parameters. The code is available at https://github.com/SebastianJanampa/DETRPose

Download Paper

LINEA: Fast and Accurate Line Detection Using Scalable Transformers

Published in IEEE International Conference on Image Processing 2025, 2025

Line detection is a basic digital image processing operation used by higher-level processing methods. Recently, transformer-based methods for line detection have proven to be more accurate than methods based on CNNs, at the expense of significantly lower inference speeds. As a result, video analysis methods that require low latencies cannot benefit from current transformer-based methods for line detection. In addition, current transformer-based models require pretraining attention mechanisms on large datasets (e.g., COCO or Objects365). This paper develops a new transformer-based method that is significantly faster without requiring pretraining the attention mechanism on large datasets. We eliminate the need to pretrain the attention mechanism using a new mechanism, Deformable Line Attention (DLA). We use the term LINEA to refer to our new transformer-based method based on DLA. Extensive experiments show that LINEA is significantly faster and outperforms previous models on sAP in out-of-distribution dataset testing. The code is available at https://github.com/SebastianJanampa/LINEA.

Download Paper

teaching

Mentor for Physics 1 and Mathematics 1

Undergraduate course, Universidad de Ingenieria y Tecnologia, Department of Student Wellness, 2018

  • Duties
    • Teach Physics 1 and Mathematics 1 classes to students at academic risk.
  • Period: Apr 2018 - May 2020

Teaching Assistant for Science Courses

Undergraduate course, Universidad de Ingenieria y Tecnologia, Department of Science, 2019

  • Duties
    • Answer students’ questions during exams.
  • Period: Mar 2019 - Jul 2019

Teaching Assistant for ECE 131L Programming Fundamentals

Undergraduate course, The University of New Mexico, Department of Electrical & Computer Engineering, 2022

  • Duties
    • Teach programming concepts, including functions, arrays, pointers, and programming in the Linux environment.
    • Combine introductory digital image processing topics with C programming.
  • Period: Jan 2022 - Dec 2023