Beginner table-tennis players require constant real-time feedback while learning the fundamental techniques. However, due to various constraints such as the mentor’s inability to be around all the time, expensive sensors and equipment for sports training, beginners are unable to get the immediate real-time feedback they need during training. Sensors have been widely used to train beginners and novices for various skills development, including psychomotor skills. Sensors enable the collection of multimodal data which can be utilised with machine learning to classify training mistakes, give feedback, and further improve the learning outcomes. In this paper, we introduce the Table Tennis Tutor (T3), a multi-sensor system consisting of a smartphone device with its built-in sensors for collecting motion data and a Microsoft Kinect for tracking body position. We focused on the forehand stroke mistake detection. We collected a dataset recording an experienced table tennis player performing 260 short forehand strokes (correct) and mimicking 250 long forehand strokes (mistake). We analysed and annotated the multimodal data for training a recurrent neural network that classifies correct and incorrect strokes. To investigate the accuracy level of the aforementioned sensors, three combinations were validated in this study: smartphone sensors only, the Kinect only, and both devices combined. The results of the study show that smartphone sensors alone perform sub-par than the Kinect, but similar with better precision together with the Kinect. To further strengthen T3’s potential for training, an expert interview session was held virtually with a table tennis coach to investigate the coach’s perception of having a real-time feedback system to assist beginners during training sessions. The outcome of the interview shows positive expectations and provided more inputs that can be beneficial for the future implementations of the T3.