Positron emission tomography (PET) is a key imaging technique in clinical practice, offering unique functional insights. However, its broader applicability is limited by radiation exposure and lengthy scan times, and reducing either degrades image quality by increasing noise. In recent years, U-Net-based convolutional neural networks have become the dominant approach to PET denoising. These methods denoise well, but they are deterministic and therefore offer limited uncertainty handling, and they are prone to oversmoothing. This paper critically evaluates the applicability of denoising diffusion probabilistic models (DDPMs) to low-dose PET image denoising, particularly when training data is limited. We adapted a conditional DDPM architecture to the PET context and compared its performance to a U-Net baseline on clinical PET data. The conditional DDPM was deployed both as a standalone method and in a hybrid ensemble, in which it was placed in series with the baseline U-Net. The conditional DDPM, while promising in theory, introduced unrealistic features in some outputs and was computationally intensive. The hybrid model, intended to combine the strengths of both models, underperformed because of high sample variability in the DDPM outputs. Contrary to recent studies suggesting DDPM superiority, our experiments show that DDPMs underperform relative to the U-Net, yielding lower PSNR and SSIM and introducing notable artifacts. Our findings highlight the importance of training dataset size and quality for DDPM effectiveness and provide practical guidelines regarding the trustworthiness of diffusion models for clinical PET denoising.
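For readers less familiar with the image-quality metrics named above, the following is a minimal sketch of how PSNR is conventionally computed between a reference (full-dose) image and a denoised estimate. It is not taken from the paper's code: the function name `psnr`, the flattened-list image representation, the caller-supplied `data_range`, and the toy voxel values are all assumptions made for illustration.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], estimate: Sequence[float], data_range: float) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized, flattened images.

    `data_range` is the assumed dynamic range of the reference image
    (for example, its maximum minus its minimum intensity).
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all voxels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(data_range ** 2 / mse)


# Toy voxel values, invented for illustration only (not data from the paper).
full_dose = [0.0, 0.5, 1.0, 0.5]
denoised = [0.1, 0.5, 0.9, 0.5]
score = psnr(full_dose, denoised, data_range=1.0)
print(f"PSNR: {score:.2f} dB")
```

Higher PSNR means a smaller mean squared error relative to the image's dynamic range. Because PSNR is purely voxel-wise, it is usually reported alongside a structural metric such as SSIM, as is done here.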