✅ Part 1: Multimodal AI
1. What does “multimodal AI” refer to?
A. AI using multiple CPUs
B. AI trained on a single dataset
C. AI that processes multiple types of input (text, image, audio, etc.)
D. AI for internet browsing
✅ Correct Answer: C
Explanation: Multimodal AI integrates different data types (modalities) like text, images, and sound.
2. Which of these is an example of a multimodal task?
A. Text translation
B. Image captioning
C. Sentiment analysis
D. Grammar correction
✅ Correct Answer: B
Explanation: Image captioning requires understanding an image and generating text.
3. Which popular model architecture is used in many multimodal systems?
A. CNN only
B. RNN only
C. Transformer
D. SVM
✅ Correct Answer: C
Explanation: Transformers underpin most current multimodal models, such as GPT-4 and Flamingo.
4. CLIP by OpenAI connects:
A. Audio and video
B. Images and text
C. Text and code
D. Speech and text
✅ Correct Answer: B
Explanation: CLIP trains an image encoder and a text encoder so that matching image-text pairs land close together in a shared embedding space.
5. Which company developed the Flamingo multimodal model?
A. Google
B. Meta
C. DeepMind
D. IBM
✅ Correct Answer: C
6. One challenge of multimodal AI is:
A. Faster computation
B. Handling low-dimensional data
C. Aligning information from different modalities
D. Limiting output
✅ Correct Answer: C
7. Multimodal learning improves:
A. Only computation speed
B. Single-modality performance
C. Generalization across data types
D. Overfitting
✅ Correct Answer: C
8. Multimodal fusion refers to:
A. Compressing files
B. Merging features from different modalities
C. Copying models
D. Model pruning
✅ Correct Answer: B
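Illustration: a minimal sketch of two common fusion styles, assuming each modality has already been turned into a plain list of numbers. The function names `early_fusion` and `late_fusion`, the sample vectors, and the default weight are all made up for illustration; real systems usually learn the fusion step rather than hard-coding it.

```python
def early_fusion(image_features, text_features):
    """Merge per-modality feature vectors into one joint vector (concatenation)."""
    return list(image_features) + list(text_features)


def late_fusion(image_scores, text_scores, image_weight=0.5):
    """Combine per-class scores produced by separate per-modality models."""
    if len(image_scores) != len(text_scores):
        raise ValueError("score lists must have the same length")
    text_weight = 1.0 - image_weight
    return [image_weight * i + text_weight * t
            for i, t in zip(image_scores, text_scores)]


# Hypothetical feature vectors from an image encoder and a text encoder.
image_features = [0.2, 0.7, 0.1]
text_features = [0.9, 0.4]
fused = early_fusion(image_features, text_features)

# Hypothetical class scores from two single-modality classifiers.
image_scores = [0.8, 0.2]
text_scores = [0.6, 0.4]
combined = late_fusion(image_scores, text_scores)
```

Early fusion lets a downstream model see both modalities at once, while late fusion keeps the per-modality models independent and only merges their outputs.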
9. Visual Question Answering (VQA) is a task that uses:
A. Text only
B. Image only
C. Text + Image
D. Text + Audio
✅ Correct Answer: C
10. Which of the following is NOT a modality in AI?
A. Text
B. Image
C. Compiler
D. Audio
✅ Correct Answer: C
11. The term “cross-modal” means:
A. Switching computers
B. Using the same input
C. Connecting different types of data
D. Data compression
✅ Correct Answer: C
12. An example of multimodal generation is:
A. Drawing with a mouse
B. Creating a video from text and audio
C. Typing in a Word document
D. Saving files
✅ Correct Answer: B
13. DALL·E is an example of:
A. Text-to-image generation
B. Audio synthesis
C. Text summarization
D. Chatbot
✅ Correct Answer: A
14. Which model handles image and text simultaneously?
A. GPT-3
B. BERT
C. CLIP
D. LLaMA
✅ Correct Answer: C
15. In Multimodal AI, “alignment” means:
A. Sorting files
B. Combining sensors
C. Matching inputs from different data types
D. Reducing latency
✅ Correct Answer: C
16. A model that takes speech as input and produces an image as output is:
A. Single-modal
B. Bimodal
C. Trimodal
D. Monolingual
✅ Correct Answer: B
17. Which of the following is a use of multimodal AI?
A. CAPTCHA solving
B. File renaming
C. Multilingual sorting
D. Multimodal translation (video + audio)
✅ Correct Answer: D
18. Zero-shot multimodal learning means:
A. No AI used
B. Training from scratch
C. The model can generalize to new tasks without being trained on them
D. Image compression
✅ Correct Answer: C
19. Combining multiple modality outputs is called:
A. Fusion
B. Scaling
C. Pruning
D. Gradient descent
✅ Correct Answer: A
20. One main limitation of early multimodal systems was:
A. Too much RAM
B. GPU-free processing
C. Lack of alignment across modalities
D. Fast inference
✅ Correct Answer: C
21. Multimodal models are often trained using:
A. Handwritten rules
B. Text files only
C. Paired data (e.g., image + caption)
D. Unpaired sound files
✅ Correct Answer: C
22. What is a “multimodal embedding space”?
A. A space where only sound is stored
B. Mapping different modalities into a shared vector space
C. A database
D. A memory cache
✅ Correct Answer: B
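Illustration: once images and text live in one shared vector space, matching them reduces to a similarity comparison. The sketch below assumes hand-picked three-dimensional embeddings; in practice these vectors would come from trained encoders, and `best_caption` is a hypothetical helper name.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_caption(image_vec, captions):
    """Return the caption whose embedding is closest to the image embedding."""
    return max(captions, key=lambda text: cosine_similarity(image_vec, captions[text]))


# Hypothetical embeddings; real encoders would produce these.
image_embedding = [0.9, 0.1, 0.0]
caption_embeddings = {
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a photo of a car": [0.1, 0.9, 0.2],
    "a bowl of soup": [0.0, 0.2, 0.9],
}

match = best_caption(image_embedding, caption_embeddings)
```

Because the comparison only needs vectors, the same lookup works in the other direction (text query, image candidates).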
23. Speech recognition is typically a task combining:
A. Text and logic
B. Audio and text
C. Image and video
D. Video and sensor
✅ Correct Answer: B
24. LLMs like GPT-4 with vision are called:
A. Audio-based
B. Single-modality models
C. Multimodal LLMs
D. CNN-based
✅ Correct Answer: C
25. Multimodal AI can best be applied in:
A. Search engines
B. Image captioning, robotics, autonomous vehicles (AVs), and assistive technology
C. Compilers
D. Spreadsheets
✅ Correct Answer: B
✅ Part 2: AI Reasoning
26. AI reasoning refers to the ability to:
A. Speak foreign languages
B. Perform logical inference or make decisions
C. Upload data
D. Hear sounds
✅ Correct Answer: B
27. Which logic is mostly used in classical reasoning systems?
A. Arithmetic logic
B. First-order logic
C. Game logic
D. Signal logic
✅ Correct Answer: B
28. A reasoning agent must:
A. Wait for user input
B. Guess randomly
C. Derive new facts from known facts
D. Store all rules only
✅ Correct Answer: C
29. Deductive reasoning means:
A. Guessing results
B. Inferring specific facts from general rules
C. Explaining emotions
D. Visual perception
✅ Correct Answer: B
30. Inductive reasoning involves:
A. Inferring general rules from specific observations
B. Deduction
C. Writing rules
D. Image transformation
✅ Correct Answer: A
31. Abductive reasoning is about:
A. Always proving truth
B. Finding the best explanation for observations
C. Math puzzles
D. Code generation
✅ Correct Answer: B
32. Which of the following is not a form of reasoning?
A. Deductive
B. Recursive
C. Inductive
D. Abductive
✅ Correct Answer: B
33. Which tool is used for logic-based reasoning in AI?
A. SQL
B. Prolog
C. TensorFlow
D. HTML
✅ Correct Answer: B
34. The difference between knowledge and reasoning is:
A. None
B. Knowledge is stored information; reasoning is the active application of it
C. Both are passive
D. Reasoning is memory
✅ Correct Answer: B
35. The Turing test evaluates:
A. Visual reasoning
B. Mathematical logic
C. Human-like intelligence, judged by whether a machine's conversation is indistinguishable from a human's
D. Audio generation
✅ Correct Answer: C
36. Symbolic AI reasoning is:
A. Based on neural networks
B. Based on hand-coded symbols and logic
C. Non-reasoning AI
D. Compression model
✅ Correct Answer: B
37. Reasoning in uncertain environments uses:
A. First-order logic only
B. Probability and fuzzy logic
C. Deterministic rules only
D. GPU-based solutions
✅ Correct Answer: B
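Illustration: fuzzy logic replaces true/false with degrees of membership between 0 and 1. This is a minimal sketch assuming triangular membership functions and the common min/max definitions of AND/OR; the temperature and humidity ranges are invented for the example.

```python
def triangular(x, left, peak, right):
    """Degree (0.0 to 1.0) to which x belongs to a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_and(a, b):
    """Fuzzy conjunction using the minimum operator."""
    return min(a, b)


def fuzzy_or(a, b):
    """Fuzzy disjunction using the maximum operator."""
    return max(a, b)


# Hypothetical fuzzy sets: "warm" peaks at 25 degrees, "humid" peaks at 80 percent.
warm = triangular(22.0, 15.0, 25.0, 35.0)
humid = triangular(60.0, 40.0, 80.0, 100.0)

# Degree to which it is both warm and humid.
muggy = fuzzy_and(warm, humid)
```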
38. In AI, a knowledge base stores:
A. Books
B. Environmental data only
C. Facts and rules
D. Tables only
✅ Correct Answer: C
39. Forward chaining starts from:
A. Goal
B. Data/facts
C. End result
D. No input
✅ Correct Answer: B
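Illustration: forward chaining starts from the known facts and keeps firing rules until nothing new can be derived. The sketch below assumes rules are stored as `(premises, conclusion)` pairs of plain strings; the animal rules are a toy example.

```python
def forward_chain(facts, rules):
    """Fire every rule whose premises are all known, until no new fact appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


# Toy if-then rules: (premises, conclusion).
rules = [
    (("has_fur", "gives_milk"), "mammal"),
    (("mammal", "eats_meat"), "carnivore"),
    (("carnivore", "has_stripes"), "tiger"),
]
facts = {"has_fur", "gives_milk", "eats_meat", "has_stripes"}

derived = forward_chain(facts, rules)
```

The process is data-driven: it derives everything that follows from the facts, whether or not a particular conclusion was asked for.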
40. Backward chaining starts from:
A. Goal
B. Known facts
C. No input
D. Inputs only
✅ Correct Answer: A
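Illustration: backward chaining starts from the goal and works back toward known facts, proving each premise as a sub-goal. This is a minimal sketch assuming rules are `(premises, conclusion)` pairs of strings; the `visiting` parameter is an added guard against circular rules.

```python
def backward_chain(goal, facts, rules, visiting=None):
    """Return True if the goal is a known fact or can be proved from the rules."""
    visiting = set() if visiting is None else visiting
    if goal in facts:
        return True
    if goal in visiting:  # already trying to prove this goal: circular rule
        return False
    visiting = visiting | {goal}
    for premises, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(p, facts, rules, visiting) for p in premises
        ):
            return True
    return False


# Toy if-then rules: (premises, conclusion).
rules = [
    (("has_fur", "gives_milk"), "mammal"),
    (("mammal", "eats_meat"), "carnivore"),
    (("carnivore", "has_stripes"), "tiger"),
]
facts = {"has_fur", "gives_milk", "eats_meat", "has_stripes"}

is_tiger = backward_chain("tiger", facts, rules)
```

The process is goal-driven: only rules that could contribute to the queried goal are explored.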
41. Rule-based systems use:
A. Random outputs
B. If-then rules
C. Quantum gates
D. Mathematical proofs only
✅ Correct Answer: B
42. Semantic reasoning is mostly applied in:
A. Search engine optimization
B. NLP and knowledge graphs
C. File transfer
D. Gaming
✅ Correct Answer: B
43. Reasoning under uncertainty is addressed by:
A. Boolean logic
B. Decision trees and probabilistic models
C. Regex
D. Arithmetic
✅ Correct Answer: B
44. Which reasoning approach does a Bayesian network follow?
A. Deterministic
B. Probabilistic
C. Reactive
D. Deductive only
✅ Correct Answer: B
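Illustration: the smallest Bayesian network has two nodes, a cause and an observation, and inference on it is Bayes' rule. The sketch below assumes a hypothetical Disease -> Test network with invented probabilities; `posterior` is an illustrative helper, not a library function.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """P(cause | evidence) for a two-node network, via Bayes' rule."""
    evidence = prior * likelihood_if_true + (1.0 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence


# Hypothetical numbers for illustration only.
p_disease = 0.01                 # P(disease)
p_positive_given_disease = 0.90  # P(test positive | disease)
p_positive_given_healthy = 0.05  # P(test positive | no disease)

p_disease_given_positive = posterior(
    p_disease, p_positive_given_disease, p_positive_given_healthy
)
```

With these numbers the posterior is roughly 0.15: a positive test raises the belief in disease from 1% to about 15%, not to certainty, which is the kind of graded conclusion deterministic logic cannot express.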
45. The AI technique that mimics human reasoning is:
A. Symbolic logic only
B. Deep learning
C. Cognitive reasoning or symbolic + neural hybrid
D. Sorting
✅ Correct Answer: C
46. Non-monotonic reasoning allows:
A. No conclusions
B. Revising beliefs when new information arrives
C. Permanent memory
D. Ignoring facts
✅ Correct Answer: B
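Illustration: in non-monotonic reasoning, adding a fact can retract an earlier conclusion. The sketch below uses the classic "birds fly unless known otherwise" default; the function `can_fly` and the exception list are assumptions made for this example.

```python
def can_fly(facts):
    """Default rule: a bird flies unless a known exception applies."""
    if "bird" not in facts:
        return False
    exceptions = {"penguin", "injured"}
    return not (facts & exceptions)


beliefs = {"bird"}
before = can_fly(beliefs)   # concluded from the default rule alone

beliefs.add("penguin")      # new information arrives
after = can_fly(beliefs)    # the earlier conclusion is withdrawn
```

In classical (monotonic) logic, adding a fact can never invalidate a previously derived conclusion; here it does.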
47. Commonsense reasoning is difficult for AI because:
A. It’s mathematical
B. It requires huge storage
C. It involves implicit knowledge humans take for granted
D. It is easy to compute
✅ Correct Answer: C
48. Reasoning in expert systems is often done via:
A. Statistics
B. Knowledge base + inference engine
C. HTML
D. JSON
✅ Correct Answer: B
49. One main difference between reasoning and learning is:
A. Reasoning doesn’t need data
B. Learning extracts patterns from data; reasoning applies logic to known facts
C. Both are the same
D. Reasoning uses charts
✅ Correct Answer: B
50. What is a common goal of AI reasoning?
A. Store more images
B. Derive new knowledge from known knowledge
C. Create documents
D. Design websites
✅ Correct Answer: B