Purpose: Ranges from specific tasks such as GUI generation to general-purpose programming assistance.
Input Modalities: Tools accept natural language, GUI selections, API specifications, sample inputs and outputs, etc.
Programming Languages: Popular tools support Python, JavaScript, Java, C#, PHP, etc.; many handle multiple languages.
Code Quality: Early tools focused on proofs of concept; the latest emphasize readability, correctness, and compliance.
Model Size: Size affects capability – smaller models suit focused tasks, while larger models handle diverse ones; large research-lab models lead on raw capability.
Interactivity: Levels vary from fully automatic generation to suggestion-based, iterative workflows for refinement.
Explainability: Understanding a model's reasoning helps debug errors and improves the trustworthiness of AI-generated code.
Applications: Prototyping, automation, education, assisted coding, documentation generation, etc.
Sample Tools: OpenAI Codex, GitHub Copilot, Tabnine, Amazon CodeWhisperer, Replit Ghostwriter, etc.
Future Advancements: Tools are progressing toward deeper understanding of domain semantics and conventions, and toward long-form projects spanning multiple workflows.
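The input modalities listed above (natural language, API specifications, sample inputs and outputs) are typically combined into a single prompt before any model is queried. A minimal, vendor-neutral sketch of that first step follows; the function and field names are illustrative, not any tool's actual API.

```python
# Hypothetical sketch: assembling the multi-modal inputs that code-generation
# tools accept into one text prompt. Names here are illustrative only.

def build_codegen_prompt(task: str, api_spec: str = "", examples=None) -> str:
    """Combine a natural-language task, an optional API spec, and optional
    input/output examples into one prompt string."""
    parts = [f"Task: {task}"]
    if api_spec:
        parts.append(f"API specification:\n{api_spec}")
    for inp, out in (examples or []):
        parts.append(f"Example input: {inp}\nExpected output: {out}")
    parts.append("Write idiomatic, well-commented code.")
    return "\n\n".join(parts)

prompt = build_codegen_prompt(
    "Parse a CSV file and return rows as dictionaries",
    examples=[("a,b\n1,2", "[{'a': '1', 'b': '2'}]")],
)
print(prompt)
```

Real tools add model-specific context (open files, cursor position, project conventions) on top of this basic structure.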
Purpose: To extract semantic textual descriptions, captions, and summaries from video content for tasks such as search and accessibility.
Input modalities: Tools support common video formats such as MP4, AVI, and MOV, along with streaming or URL-based inputs.
Output formats: Text output is provided in formats such as SRT, WebVTT, JSON, and XML, suitable for various downstream uses.
Speed: Google and Microsoft offer real-time APIs for streaming needs, while batch processing from AWS and IBM suits large video libraries.
Accuracy: Depends on speech clarity, ambient noise, and video quality; introducing contextual cues helps improve comprehension.
Language support: Popular tools cover many languages but quality varies significantly across languages and domains.
Model customization: A few tools enable domain- and task-specific fine-tuning, while others rely on general pre-trained models.
Applications: Search, accessibility, social media content moderation, education, media monitoring, law enforcement, etc.
Future scope: Handling videos with multiple overlapping speakers, sign-language translation, facial-expression understanding, and abstraction and summarization all need further progress.
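To make the output-format point concrete, here is a minimal sketch of rendering SRT (one of the formats listed above) from timed transcript segments, assuming each segment is a `(start_seconds, end_seconds, text)` tuple; real tools emit this from their recognition pipeline.

```python
# Minimal SRT writer: numbered cues with HH:MM:SS,mmm timestamps.

def to_srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments) -> str:
    """Render numbered SRT cues from (start, end, text) tuples."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, 1):
        blocks.append(
            f"{i}\n{to_srt_timestamp(start)} --> {to_srt_timestamp(end)}\n{text}"
        )
    return "\n\n".join(blocks) + "\n"

print(segments_to_srt([(0.0, 2.5, "Hello, world."), (2.5, 5.0, "Welcome back.")]))
```

WebVTT output is nearly identical apart from a `WEBVTT` header and `.` instead of `,` in timestamps, which is why many tools offer both.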
Purpose: Tools range from interactive experimentation (DeepArt, NeuralStyle) to industrial applications (StyleGAN, Pixray).
Input modalities: Photos are commonly transformed using styles extracted from artwork, other photos, or modalities such as drawings and textures.
Control level: Some apply a chosen style image globally (NeuralStyle), while others enable guided control over the result (Pixray).
Pretrained models: Commonly used pretrained models include StyleGAN2 and VGG networks; custom models can also be developed.
Style sources: Styles can be arbitrary images or curated datasets spanning art forms, artists, genres, etc.
Output quality: Research tools focus on demonstrating ideas, while industrial ones ensure high, commercial-grade quality.
Speed: Mobile apps are optimized for real-time use, while other tools require GPU processing.
Generalization: The ability to transfer styles unseen during training while retaining the semantic content of the input.
Future scope: Advancing toward fine-grained control over spatial and semantic style attributes within and across domains.
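The "styles extracted from artwork" mentioned above are, in the classic neural-style-transfer formulation, Gram matrices of convolutional feature maps (e.g., from VGG layers): channel-wise correlations that capture texture and style independent of spatial layout. A small sketch with NumPy standing in for real framework tensors:

```python
# Style representation used by classic neural style transfer: the Gram
# matrix of a (channels, height, width) feature map, normalized by
# spatial size. NumPy stands in for PyTorch/TensorFlow tensors here.
import numpy as np

def gram_matrix(features: np.ndarray) -> np.ndarray:
    """Return the (channels, channels) Gram matrix of a feature map."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)   # flatten spatial dimensions
    return flat @ flat.T / (h * w)      # channel-wise correlations

rng = np.random.default_rng(0)
g = gram_matrix(rng.standard_normal((8, 16, 16)))
print(g.shape)  # (8, 8)
```

Style-transfer optimization then minimizes the difference between the Gram matrices of the generated image and the style image across several layers, while a separate content loss preserves the input's semantics.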
Purpose – Tools range from composition (Jukebox, Magenta) and generation of musical elements (MuseNet, Magenta) to interactive music creation (Amper) and commercial song production (AIVA, Jukedeck).
Input modalities – Tools accept different inputs such as text prompts (Jukebox), MIDI (MuseNet), lyrics, and melodies, or analyze existing songs (AIVA).
Output quality – Open-source tools focus on demonstrating ideas, while commercial tools ensure polished outputs suitable for production and IP needs. Quality also depends on model size.
Genre handling – Generalist tools generate many styles but specialist tools offer more control within genres like classical (MuseNet).
Interactivity – The level of user control differs, from low (AIVA) to high interactivity with refinement loops (Jukedeck, Amper, Form).
Flexibility – Tools allowing diverse inputs (e.g., text in Jukebox) and iterative experimentation are best suited to exploration.
Development status – The field is rapidly evolving: early tools proved concepts, while the latest generate coherent longer works.
Applications – Ideation, education, collaboration, IP generation, accessibility, and entertainment across music and other arts.
Future scope – Interpretability of emotion, directional control of outputs, style transfer between domains, and generation tied to human affect remain open problems.
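The symbolic (MIDI-style) input that tools like MuseNet consume is just a sequence of pitch/duration events; MIDI assigns each semitone an integer pitch with C4 = 60. A small, self-contained sketch of that representation, with the note-name parser being an illustrative helper rather than any tool's API:

```python
# Hypothetical helper: map note names like "C4" or "F#3" to MIDI pitch
# numbers (C4 = 60, A4 = 69) and build a melody as (pitch, beats) events.

NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def note_to_midi(name: str) -> int:
    """Convert a note name with optional #/b accidental to a MIDI pitch."""
    letter, rest = name[0], name[1:]
    semitone = NOTE_OFFSETS[letter]
    if rest.startswith("#"):
        semitone, rest = semitone + 1, rest[1:]
    elif rest.startswith("b"):
        semitone, rest = semitone - 1, rest[1:]
    octave = int(rest)
    return 12 * (octave + 1) + semitone  # MIDI octaves start at C-1

melody = [(note_to_midi(n), 1.0) for n in ["C4", "E4", "G4", "C5"]]
print(melody)  # [(60, 1.0), (64, 1.0), (67, 1.0), (72, 1.0)]
```

Generative models operate on token sequences derived from exactly this kind of event list, which is why MIDI input gives users more precise control than raw-audio or text conditioning.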