• fidotron 3 hours ago

    This is one of the areas where the Raspberry Pi ecosystem is not as strong as either Android or random Chinese devices, so it's good to see them doing this.

    For example https://wiki.sipeed.com/soft/maixpy/en/develop_kit_board/mai... has been around for years, and is far, far better than it should be, although the resolutions supported by the Pi AI camera are way higher.

    The West has been very slow to get into the whole edge inference game.

    • asenna 2 hours ago

      That MaixCube looks cool. Do you know of any similar products with a much higher resolution?

    • gardnr 5 hours ago

      Mostly classification and object detection. Looks like segmentation and pose estimation as well.

      Here's the page for the chip: https://developer.sony.com/imx500

      And the model zoo: https://github.com/raspberrypi/imx500-models

      • teruakohatu 5 hours ago

        From the article:

        > Using Sony’s suite of AI tools, existing neural network models using frameworks such as TensorFlow or PyTorch can be converted to run efficiently on the AI Camera. Alternatively, new models can be designed to take advantage of the AI accelerator’s specific capabilities.

        From Sony's website [1]

        > Deploy your AI model to IMX500 in three steps.

        > 1. Convert your own quantized and trained deep neural network model into an optimized binary file.

        > 2. Package your AI model as a cryptographically signed package.

        > 3. Deploy your packaged model securely to IMX500.

        and [2]

        > Memory size 8388480 bytes for firmware, network weight file, and working memory

        8.39mb sounds a lot smaller than 8388480 bytes. It is not clear how much of that is available for the model and how much goes to working memory and firmware.

        Does anyone know how well these heavily quantized image models perform? I have never tried or needed to reduce a model to <8.4 MB.
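
        For a rough sense of scale, here's a back-of-the-envelope check of how many weights would fit in that budget at different bit widths (the spec doesn't break down how the bytes are split between firmware, weights, and working memory, so treat these as upper bounds):

          # Back-of-the-envelope only: the 8388480 bytes are shared with
          # firmware and working memory, and the split is not documented.
          budget_bytes = 8_388_480

          for bits in (32, 16, 8, 4):
              max_params = budget_bytes * 8 // bits
              print(f"{bits}-bit weights: at most ~{max_params / 1e6:.1f}M parameters")

          # 32-bit: ~2.1M params, 16-bit: ~4.2M, 8-bit: ~8.4M, 4-bit: ~16.8M

        So even at int8 the ceiling is roughly the MobileNet / *-tiny class of models, before firmware and working memory take their share.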

        [1] https://developer.sony.com/imx500

        [2] https://developer.sony.com/imx500/imx500-key-specifications

        • fidotron 3 hours ago

          > Does anyone know how well these heavily quantized image models perform?

          IME it's better if they are quantized during training/refinement, as quantizing only after the fact can definitely hurt model performance.
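
          Roughly what I mean, as a minimal quantization-aware training sketch with the TensorFlow Model Optimization toolkit (generic QAT, not Sony's tooling, and the toy model is just a placeholder):

            import tensorflow as tf
            import tensorflow_model_optimization as tfmot

            # Stand-in classifier; substitute your real architecture
            model = tf.keras.Sequential([
                tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(96, 96, 3)),
                tf.keras.layers.GlobalAveragePooling2D(),
                tf.keras.layers.Dense(10),
            ])

            # Insert fake-quantization ops so weights/activations adapt to 8-bit during training
            qat_model = tfmot.quantization.keras.quantize_model(model)
            qat_model.compile(optimizer="adam",
                              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                              metrics=["accuracy"])
            # qat_model.fit(train_ds, epochs=...)  # fine-tune on your own data

            # Export a fully int8 artifact afterwards, the kind of thing step 1 upthread expects
            converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
            converter.optimizations = [tf.lite.Optimize.DEFAULT]
            tflite_int8 = converter.convert()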

          > I have never tried or needed to reduce a model to <8.4 MB.

          Same. The limit has been inference speed rather than model size. It's quite unusual to run video inference at full resolution because of this (having spent too much of my life up-scaling segmentation masks), but maybe their accelerator is actually capable of that speed? They don't seem to say what image resolution their models are optimized for.

          EDIT: The Sony specs say input tensor size is 640x480.
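
          If the accelerator can't keep up at full sensor resolution, the usual fallback (a sketch with made-up frame sizes) is to infer at the model's native input size and upscale the result back to the full frame:

            import cv2
            import numpy as np

            # Hypothetical class-index segmentation mask at the model's 640x480 input size
            mask = np.zeros((480, 640), dtype=np.uint8)

            # Nearest-neighbour keeps class indices intact instead of blending them
            full_frame_mask = cv2.resize(mask, (1920, 1080), interpolation=cv2.INTER_NEAREST)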

          • hcfman 2 hours ago

            Really terribly. To put it into context, from memory, the yolov3-tiny model was around 6 million parameters. How many parameters will this model be using? Indeed, no one is saying what it is quantized to, but you can be pretty sure it's 8 bits.

            I deployed a yolov3-tiny model back in 2019 with 32-bit weights. It thought the drain pipe in the driveway was a person.

            Whenever people hype this up, they show pre-selected images with no candidates for false positives. In reality you would not want to be woken in the middle of the night by the output of these things.

          • CalRobert 4 hours ago

            "8.39mb sounds a lot smaller than 8388480 bytes. "

            I don't follow?

            (Assuming you meant MB and not millibits - if the latter I definitely agree!)

          • fisian 3 hours ago

            I find cameras with embedded processing really interesting. I have used the PixieCam before for on-camera image processing and object detection.

            One big benefit is that there are fewer requirements on the communication interface (e.g. lower bandwidth). Additionally, it makes things a lot easier for embedded applications, e.g. a low-power microcontroller that wouldn't have the processing power to do the image processing itself.
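
            As a rough illustration of the bandwidth difference (made-up numbers, not the IMX500's actual interface format):

              FPS = 30
              raw_stream = 640 * 480 * 2 * FPS             # ~18 MB/s of raw 16-bit frames

              DETECTIONS_PER_FRAME = 10
              BYTES_PER_DETECTION = 16                     # packed class id + bbox + score
              detection_stream = DETECTIONS_PER_FRAME * BYTES_PER_DETECTION * FPS  # ~4.8 KB/s

              print(raw_stream // detection_stream)        # ~3840x less data to move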

            • cogogo 2 hours ago

              What are the common use cases for this? I’m lacking the imagination to extend the demos to something practical.
