The goal of this thesis is to investigate and implement deep learning methods for predicting oriented 3D bounding boxes from a single monocular RGB image. For each detected object, this involves estimating its 3D position, physical dimensions, and 3D orientation. The work builds on current research in monocular 3D object detection and addresses typical challenges such as scale ambiguity, occlusion, and viewpoint variability.
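To make the prediction target concrete, the following is a minimal sketch of one common way to parameterize an oriented 3D bounding box (center, dimensions, rotation) and project its corners into the image with a pinhole camera model. The function names, the intrinsic matrix `K`, the yaw-only rotation, and all numeric values are illustrative assumptions, not the parameterization fixed by the thesis. The final lines illustrate scale ambiguity: a box twice as large and twice as far away projects to the same pixels.

```python
import numpy as np

def box_corners(center, dims, R):
    """Return the 8 corners (8x3) of an oriented box in camera coordinates."""
    # One row of +/-1 signs per corner of the unit cube.
    signs = np.array(
        [[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
        dtype=float,
    )
    local = 0.5 * signs * np.asarray(dims, dtype=float)  # corners in the box frame
    # Rotate into the camera frame, then translate to the box center.
    return local @ np.asarray(R, dtype=float).T + np.asarray(center, dtype=float)

def project(points, K):
    """Pinhole projection of Nx3 camera-frame points to Nx2 pixel coordinates."""
    uvw = points @ np.asarray(K, dtype=float).T
    return uvw[:, :2] / uvw[:, 2:3]

def yaw_rotation(theta):
    """Rotation about the camera y-axis (assumed here for simplicity)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

# Hypothetical camera intrinsics and box parameters, chosen for illustration only.
K = np.array([[700.0, 0.0, 640.0], [0.0, 700.0, 360.0], [0.0, 0.0, 1.0]])
center = np.array([1.0, 0.5, 10.0])   # 3D position in metres
dims = np.array([4.0, 1.5, 1.8])      # physical extents along the box axes
R = yaw_rotation(0.3)                 # 3D orientation

pixels = project(box_corners(center, dims, R), K)

# Scale ambiguity: scaling position and dimensions together leaves the image unchanged.
pixels_scaled = project(box_corners(2.0 * center, 2.0 * dims, R), K)
```

Because both boxes produce identical 2D corners, a single image cannot separate object size from distance by geometry alone, which is why monocular methods typically rely on learned priors about object dimensions.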