GNU/Linux Desktop Survival Guide
by Graham Williams
MLHub Pipelines
20201024 A general mlhub philosophy is that the output from a command should be in a well-defined text format. Typically this is csv (comma separated values), kept consistent so that follow-on processes within a pipeline can further process the results. These might even be other mlhub models. Each mlhub command focuses on its specific task and aims to implement that task well rather than to solve every problem. We can then leave extra processing to other specialist tools, like sed, cut, and awk.
This example deploys an optical character recognition capability from the ocr command of the azcv model:
$ ml ocr azcv ~/.mlhub/azcv/cache/images/mycat.png | head -2
51.0 43.0 668.0 51.0 667.0 85.0 51.0 77.0,My cats name is freckles. She like's to climb up
37.0 97.0 691.0 104.0 690.0 134.0 37.0 128.0,high. She is 2 years old. She likes to play a...
$ ml ocr azcv ~/.mlhub/azcv/cache/images/mycat.png | head -2 | sed 's/,/\t/'
51.0 43.0 668.0 51.0 667.0 85.0 51.0 77.0 My cats name is freckles. She like's to cl...
37.0 97.0 691.0 104.0 690.0 134.0 37.0 128.0 high. She is 2 years old. She likes to pla...
If you do not need the bounding boxes that the ocr command outputs by default, simply remove them using cut:
$ ml ocr azcv ~/.mlhub/azcv/cache/images/mycat.png | head -2 | cut -d, -f2-
My cats name is freckles. She like's to climb up
high. She is 2 years old. She likes to play a lot of games.
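Where both the position and the text are wanted, awk can split each line at its first comma, in the same spirit as the sed and cut commands above. The following is a minimal sketch rather than part of mlhub: the helper names sample_ocr and first_corner_and_text are hypothetical, the sample lines are copied from the output shown above so that the example runs without the azcv model installed, and it assumes the eight numbers before the comma are four x,y pairs.

```shell
# Sample lines in the format shown above, standing in for `ml ocr azcv ...`.
sample_ocr() {
  cat <<'EOF'
51.0 43.0 668.0 51.0 667.0 85.0 51.0 77.0,My cats name is freckles. She like's to climb up
37.0 97.0 691.0 104.0 690.0 134.0 37.0 128.0,high. She is 2 years old. She likes to play a...
EOF
}

# Print the first x,y pair of each bounding box, a tab, then the text.
# Splitting at the first comma only keeps any commas inside the text intact.
first_corner_and_text() {
  awk '{
    i    = index($0, ",")          # position of the first comma
    box  = substr($0, 1, i - 1)    # the numbers before it
    text = substr($0, i + 1)       # everything after it, commas included
    split(box, p, " ")             # p[1], p[2] are the first x,y pair (assumed)
    printf "%s,%s\t%s\n", p[1], p[2], text
  }'
}

sample_ocr | first_corner_and_text
```

With the model installed, the same function could be placed at the end of the real pipeline in place of cut.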
We can process every jpg image file in a directory, which may hold several hundred files, saving the text extracted from each image into a txt file alongside it. The following pipeline utilises a for loop, an ml model, and the cut command:
$ for f in images/*.jpg; do echo "=====> $f"; ml ocr azcv "$f" | cut -d, -f2- > "$(dirname "$f")/$(basename "$f" .jpg).txt"; done
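For a directory that might be empty or contain file names with spaces, the loop above can be wrapped in a slightly more defensive function. This is only a sketch: the function name ocr_dir is hypothetical, and the command to run on each image is passed as arguments so that the loop can be tried with any command that writes lines in the box,text format. The shell expansion "${f%.jpg}.txt" yields the same file name as the dirname and basename combination.

```shell
# ocr_dir DIR COMMAND [ARGS...]
# Run COMMAND on every jpg file in DIR, keeping only the text after the
# first comma, and save it next to the image with a .txt extension.
ocr_dir() {
  dir=$1
  shift                                  # the remaining arguments are the command
  for f in "$dir"/*.jpg; do
    [ -e "$f" ] || continue              # no jpg files: the pattern stays literal
    echo "=====> $f"
    "$@" "$f" | cut -d, -f2- > "${f%.jpg}.txt"
  done
}

# Intended use with the model from this section (not run here):
#   ocr_dir images ml ocr azcv
```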
Here we transcribe spoken English into text and then translate that text into Persian (Farsi) using azspeech2txt and aztranslate:
$ ml transcribe azspeech2txt friend.wav | ml translate aztranslate --to=fa
en,1.0,fa,...
A compelling example of a pipeline is to transcribe our English utterances, translate to French and then synthesise into a female French voice using a combination of azspeech and aztranslate. Here it is:
$ ml transcribe azspeech | ml translate aztranslate --to=fr | cut -d',' -f4- | ml synthesize azspeech --voice=fr-FR-HortenseRUS
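The cut step in this pipeline drops the leading fields of the translate output so that only the translated text reaches the synthesiser. The sketch below illustrates why -f4- is used rather than -f4. The sample line is hypothetical, modelled on the en,1.0,fa,... output shown earlier, and it assumes the text is everything from the fourth comma-separated field onward.

```shell
# Hypothetical translate output: three leading fields, then the text.
line='en,1.0,fr,Bonjour, mon ami'

# Keep field 4 and everything after it, so commas in the text survive.
text_only() { cut -d',' -f4-; }

printf '%s\n' "$line" | text_only         # prints: Bonjour, mon ami
printf '%s\n' "$line" | cut -d',' -f4     # prints: Bonjour
```

The open-ended range matters because cut splits on every comma, including those inside the translated sentence.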