PhyCV

PhyCV is the first computer vision library that uses algorithms derived directly from the equations of physics governing physical phenomena. The algorithms in the first release emulate the propagation of light through a physical medium with natural and engineered diffractive properties, followed by coherent detection. Unlike traditional algorithms, which are sequences of hand-crafted empirical rules, physics-inspired algorithms use the physical laws of nature as blueprints. In addition, these algorithms can, in principle, be implemented in real physical devices for fast and efficient computation in the form of analog computing.[1] PhyCV currently has three algorithms: Phase-Stretch Transform (PST), Phase-Stretch Adaptive Gradient-Field Extractor (PAGE), and Vision Enhancement via Virtual diffraction and coherent Detection (VEViD). All algorithms have CPU and GPU versions. PhyCV is available on GitHub and can be installed with pip.

Real-time edge detection at 35 frames per second using PhyCV on a 4K video (frame size 3840x2160).
Retina vessel detection using PST in PhyCV.
Directional edge detection of a sunflower using PAGE in PhyCV.
Low-light enhancement using VEViD in PhyCV.
Color enhancement using VEViD in PhyCV.

History

Algorithms in PhyCV are inspired by the physics of the photonic time stretch[2][3] (a hardware technique for ultrafast and single-shot data acquisition). PST is an edge detection algorithm that was open-sourced in 2016 and has 800+ stars and 200+ forks on GitHub. PAGE is a directional edge detection algorithm that was open-sourced in February 2022. PhyCV was originally developed and open-sourced by the Jalali Lab at UCLA in May 2022. In the initial release of PhyCV, the original open-sourced code of PST and PAGE was significantly refactored and improved to be modular, more efficient, GPU-accelerated, and object-oriented. VEViD is a low-light and color enhancement algorithm that was added to PhyCV in November 2022.

Background

Phase-Stretch Transform (PST)

Phase-Stretch Transform (PST) is a computationally efficient edge and texture detection algorithm with exceptional performance in visually impaired images.[4][5][6] The algorithm transforms the image by emulating the propagation of light through a device with an engineered diffractive property, followed by coherent detection. It has been applied to improving the resolution of MRI images,[7] extracting blood vessels in retina images,[8] dolphin identification,[9] wastewater treatment,[10] single-molecule biological imaging,[11] and classification of UAVs using micro-Doppler imaging.[12]

Phase-Stretch Adaptive Gradient-Field Extractor (PAGE)

Phase-Stretch Adaptive Gradient-Field Extractor (PAGE) is a physics-inspired algorithm for detecting edges and their orientations in digital images at various scales.[13][14] The algorithm is based on the diffraction equations of optics. Metaphorically speaking, PAGE emulates the physics of birefringent (orientation-dependent) diffractive propagation through a physical device with a specific diffractive structure. The propagation converts a real-valued image into a complex function. Related information is contained in the real and imaginary components of the output. The output represents the phase of the complex function.

Vision Enhancement via Virtual diffraction and coherent Detection (VEViD)

The YOLO-v3 object detector is improved by VEViD in PhyCV.

Vision Enhancement via Virtual diffraction and coherent Detection (VEViD) is an efficient and interpretable low-light and color enhancement algorithm that reimagines a digital image as a spatially varying metaphoric light field and then subjects the field to physical processes akin to diffraction and coherent detection.[15] The term "Virtual" captures the deviation from the physical world: the light field is pixelated, and the propagation imparts a phase with an arbitrary dependence on frequency, which can differ from the quadratic behavior of physical diffraction. VEViD can be further accelerated through mathematical approximations that reduce the computation time without appreciable sacrifice in image quality. A closed-form approximation of VEViD, called VEViD-lite, can achieve up to 200 FPS for 4K video enhancement.
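As a rough illustration of the idea described above, the sketch below applies a phase kernel in the frequency domain and then performs a phase-type detection on a single image channel. It is a NumPy sketch under stated assumptions, not the PhyCV implementation: the function name `vevid_sketch`, the Gaussian kernel shape, and the parameters `S`, `T`, `b`, and `G` with their default values are all chosen here for illustration.

```python
import numpy as np


def vevid_sketch(v, S=0.2, T=0.001, b=0.16, G=1.4):
    """Illustrative virtual diffraction followed by phase detection.

    v: 2-D array in [0, 1], e.g. the V channel of an HSV image.
    S, T, b, G: hypothetical phase strength, kernel variance,
    regularization bias, and phase gain (assumed names and defaults).
    """
    v = np.asarray(v, dtype=np.float64)
    h, w = v.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    # Assumed kernel: a Gaussian phase profile, i.e. deliberately not the
    # quadratic phase of physical diffraction.
    kernel = S * np.exp(-(fy**2 + fx**2) / T)
    # Virtual diffraction: add the bias, then apply the phase in the
    # frequency domain and return to the spatial domain.
    field = np.fft.ifft2(np.fft.fft2(v + b) * np.exp(-1j * kernel))
    # Detection: angle formed from the amplified imaginary part of the
    # diffracted field and the original input.
    phase = np.arctan2(G * field.imag, v)
    # Normalize the result to [0, 1]; guard against a flat output.
    span = phase.max() - phase.min()
    if span == 0:
        return np.zeros_like(v)
    return (phase - phase.min()) / span
```

Applied to a dark channel with values in [0, 1], the normalized output should lift low intensities more than high ones, which is the qualitative behavior expected of a low-light enhancer.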

PhyCV on the Edge

With its low dimensionality and high efficiency, PhyCV is well suited to edge computing applications. In this section, we demonstrate running PhyCV on an NVIDIA Jetson Nano in real time.

NVIDIA Jetson Nano Developer Kit

The NVIDIA Jetson Nano Developer Kit is a small, power-efficient platform for edge computing applications. It is equipped with an NVIDIA Maxwell architecture GPU with 128 CUDA cores, a quad-core ARM Cortex-A57 CPU, and 4GB of 64-bit LPDDR4 RAM, and it supports video encoding and decoding up to 4K resolution. The Jetson Nano also offers a variety of interfaces for connectivity and expansion, making it suitable for a wide range of AI and IoT applications. In our setup, we connect a USB camera to the Jetson Nano to acquire videos and use PhyCV to process them in real time.

PhyCV real-time low-light enhancement using Jetson Nano
PhyCV real-time edge detection using Jetson Nano

Real-time PhyCV on Jetson Nano

We use the Jetson Nano (4GB) with NVIDIA JetPack SDK version 4.6.1, which comes with Python 3.6, CUDA 10.2, and OpenCV 4.1.1 pre-installed. We additionally install PyTorch 1.10 to enable the GPU-accelerated version of PhyCV. We report the results and metrics of running PhyCV on the Jetson Nano in real time for edge detection and low-light enhancement. For 480p videos, both operations exceed 38 FPS, which is sufficient for most cameras that capture video at 30 FPS. For 720p videos, PhyCV low-light enhancement runs at 24 FPS and PhyCV edge detection runs at 17 FPS.

Running time (per frame) on Jetson Nano

Resolution | PhyCV Edge Detection | PhyCV Low-light Enhancement
480p (640 x 480) | 25.9 ms | 24.5 ms
720p (1280 x 720) | 58.5 ms | 41.1 ms
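
The frame rates quoted above follow directly from the per-frame running times, since FPS = 1000 / (milliseconds per frame). The short Python check below makes that arithmetic explicit. The dictionary layout and the function name `frames_per_second` are illustrative choices; the timings are copied from the table.

```python
# Per-frame running times (ms) on Jetson Nano, copied from the table.
running_time_ms = {
    ("480p", "edge detection"): 25.9,
    ("480p", "low-light enhancement"): 24.5,
    ("720p", "edge detection"): 58.5,
    ("720p", "low-light enhancement"): 41.1,
}


def frames_per_second(ms_per_frame):
    """Convert a per-frame running time in milliseconds to frames per second."""
    return 1000.0 / ms_per_frame


fps = {key: frames_per_second(ms) for key, ms in running_time_ms.items()}

for (resolution, task), value in fps.items():
    print(f"{resolution} {task}: {value:.1f} FPS")
```

This gives roughly 38.6 and 40.8 FPS at 480p, and 17.1 and 24.3 FPS at 720p, consistent with the figures in the text.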

Highlights

Modular Code Architecture

Modular code architecture of PhyCV algorithms.

The code in PhyCV has a modular design that faithfully follows the physical process from which each algorithm originated. Both the PST and PAGE modules in the PhyCV library emulate the propagation of the input signal (the original digital image) through a device with an engineered diffractive property, followed by coherent (phase) detection. The dispersive propagation applies a phase kernel to the original image in the frequency domain. In general, this process has three steps: loading the image, initializing the kernel, and applying the kernel. In PhyCV, each algorithm is implemented as a Python class whose methods correspond to these steps. Please refer to the source code on GitHub for more details.
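
The three steps described above can be sketched as a small Python class. This is a minimal NumPy illustration, not the PhyCV source: the class name `PhaseKernelSketch`, its method names, the parameters `S` (phase strength) and `W` (warp), and the kernel profile are assumptions chosen for illustration, and the smoothing and thresholding stages that a practical edge detector would normally include are omitted.

```python
import numpy as np


class PhaseKernelSketch:
    """Hypothetical three-step pipeline: load image, init kernel, apply kernel."""

    def load_img(self, img_array):
        # Step 1: store a 2-D grayscale image as a float array.
        self.img = np.asarray(img_array, dtype=np.float64)
        self.h, self.w = self.img.shape

    def init_kernel(self, S, W):
        # Step 2: build a radially symmetric phase kernel on the FFT
        # frequency grid (assumed profile: x*arctan(x) - 0.5*ln(1 + x^2)).
        fy = np.fft.fftfreq(self.h)[:, None]
        fx = np.fft.fftfreq(self.w)[None, :]
        rw = np.sqrt(fy**2 + fx**2) * W
        kernel = rw * np.arctan(rw) - 0.5 * np.log1p(rw**2)
        # Scale so that the largest phase shift equals S.
        self.kernel = S * kernel / kernel.max()

    def apply_kernel(self):
        # Step 3: apply the phase in the frequency domain, transform back,
        # and keep the phase of the complex result (coherent detection).
        spectrum = np.fft.fft2(self.img)
        out = np.fft.ifft2(spectrum * np.exp(-1j * self.kernel))
        self.phase = np.angle(out)
        return self.phase


# Usage: a vertical step edge between columns 15 and 16.
step = np.ones((32, 32))
step[:, 16:] = 2.0
sketch = PhaseKernelSketch()
sketch.load_img(step)
sketch.init_kernel(S=0.5, W=20.0)
edge_phase = sketch.apply_kernel()
```

On a constant image the detected phase is zero everywhere; in the step-edge example the phase magnitude is larger at the edge than in the flat regions, which is what makes the output usable as an edge map.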

GPU Acceleration

PhyCV supports GPU acceleration. The GPU versions of PST and PAGE are built on PyTorch and accelerated by the CUDA toolkit. The acceleration is beneficial for applying the algorithms to real-time image and video processing and to other deep learning tasks. The running times per frame of the PhyCV algorithms on a CPU (Intel i9-9900K) and a GPU (NVIDIA TITAN RTX) for videos at different resolutions are shown below. Note that PhyCV low-light enhancement operates in the HSV color space, so its running time also includes the RGB-to-HSV conversion. For all GPU running times, however, we exclude the time spent moving data from the CPU to the GPU and count only the algorithm's operation time.

PhyCV - PST Edge Detection

Resolution | CPU | GPU
1080p (1920 x 1080) | 550 ms | 4.6 ms
2K (2560 x 1440) | 1000 ms | 8.2 ms
4K (3840 x 2160) | 2290 ms | 18.5 ms

PhyCV - PAGE Directional Edge Detection (10 directions)

Resolution | CPU | GPU
1080p (1920 x 1080) | 2800 ms | 48.5 ms
2K (2560 x 1440) | 5000 ms | 87 ms
4K (3840 x 2160) | 11660 ms | 197 ms

PhyCV - VEViD Low-light Enhancement

Resolution | CPU | GPU
1080p (1920 x 1080) | 175 ms | 4.3 ms
2K (2560 x 1440) | 320 ms | 7.8 ms
4K (3840 x 2160) | 730 ms | 17.9 ms

PhyCV - VEViD-lite Low-light Enhancement

Resolution | CPU | GPU
1080p (1920 x 1080) | 60 ms | 2.1 ms
2K (2560 x 1440) | 110 ms | 3.5 ms
4K (3840 x 2160) | 245 ms | 7.4 ms

Installation and Examples

Please refer to the GitHub README file for detailed technical documentation.

Current Limitations

I/O (Input/Output) Bottleneck for Real-time Video Processing

When dealing with real-time video streams from cameras, the frames are captured and buffered on the CPU and must be moved to the GPU to run the GPU-accelerated PhyCV algorithms. This transfer is time-consuming and is a common bottleneck for real-time video-processing algorithms.

Lack of Parameter Adaptivity for Different Images

Currently, the parameters of PhyCV algorithms have to be tuned manually for different images. Although a set of pre-selected parameters works relatively well for a wide range of images, the lack of parameter adaptivity remains a limitation for now.

References

  1. Physics-based Feature Engineering. Jalali et al. Optics, Photonics and Laser Technology, 2019
  2. Time-stretched analogue-to-digital conversion. Bhushan et al. Electronics Letters, 1998
  3. Time stretch and its applications. Mahjoubfar et al. Nature Photonics, 2017
  4. Physics-inspired image edge detection. Asghari et al. IEEE Global Signal and Information Processing Symposium, 2014
  5. Edge detection in digital images using dispersive phase stretch. Asghari et al. International Journal of Biomedical Imaging, 2015
  6. Feature Enhancement in Visually Impaired Images. Suthar et al. IEEE Access, 2018
  7. Fast Super-Resolution in MRI Images Using Phase Stretch Transform, Anchored Point Regression and Zero-Data Learning. He et al. IEEE International Conference on Image Processing, 2019
  8. A local flow phase stretch transform for robust retinal vessel detection. Challoob et al. International Conference on Advanced Concepts for Intelligent Vision Systems, 2020
  9. Dolphin Identification Method Based on Improved PST. Wang et al. 2021 IEEE/ACIS 6th International Conference on Big Data, Cloud Computing, and Data Science (BCD), 2021
  10. Image segmentation of activated sludge phase contrast images using phase stretch transform. Ang et al. Microscopy, 2019
  11. Phase stretch transform for super-resolution localization microscopy. Ilovitsh et al. Biomedical Optics Express, 2016
  12. Classification of Drones Using Edge-Enhanced Micro-Doppler Image Based on CNN. Singh et al. Traitement du Signal, 2021
  13. Phase-stretch adaptive gradient-field extractor (PAGE). Suthar et al. Coding Theory, 2020
  14. Phase-Stretch Adaptive Gradient-Field Extractor (PAGE). MacPhee et al. arXiv preprint arXiv:2202.03570, 2022
  15. VEViD: Vision Enhancement via Virtual diffraction and coherent Detection. Jalali et al. eLight, 2022
    This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.