Looking for test answers and solutions for FIT3182 Big data management and processing - MUM S1 2026? Browse our large collection of verified answers for FIT3182 Big data management and processing - MUM S1 2026 at learning.monash.edu.
With reference to the Spark Structured Streaming programming model, as illustrated in the figure below, the "Output" is defined as what gets written out to the external storage. The output can be defined in three different modes.
Which of the following is true with reference to the aforementioned modes?
i) In Complete Mode, the entire updated Result Table will be written to the external storage.
ii) In Append Mode, only the new rows appended in the Result Table since the last trigger will be written to the external storage.
iii) In Update Mode, only the rows that were updated in the Result Table since the last trigger will be written to the external storage.
iv) Update Mode is equivalent to Append Mode if the query contains aggregation.
How should you avoid mutable, growing arrays in the following schema?
{
name: "O'Reilly Media",
founded: 1980,
location: "CA",
books: [123456789, 234567890, ...]
}
{
_id: 123456789,
title: "MongoDB: The Definitive Guide",
author: [ "Kristina Chodorow", "Mike Dirolf" ],
published_date: ISODate("2010-09-24"),
pages: 216,
language: "English"
}
{
_id: 234567890,
title: "50 Tips and Tricks for MongoDB Developer",
author: "Kristina Chodorow",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English"
}
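One common way to avoid the growing `books` array is to reverse the direction of the reference, so that each book document points to its publisher. The sketch below shows that layout in plain JavaScript; the `_id` value `"oreilly"` and the `publisher_id` field name are assumptions made for illustration, and the final filter is only a stand-in for a query such as `db.books.find({ publisher_id: "oreilly" })`.

```javascript
// Publisher document without the mutable, growing `books` array.
// The _id value "oreilly" is a hypothetical identifier chosen for this sketch.
const publisher = {
  _id: "oreilly",
  name: "O'Reilly Media",
  founded: 1980,
  location: "CA"
};

// Each book stores a reference to its publisher instead
// (the field name `publisher_id` is an assumption).
const books = [
  {
    _id: 123456789,
    title: "MongoDB: The Definitive Guide",
    author: ["Kristina Chodorow", "Mike Dirolf"],
    published_date: new Date("2010-09-24"),
    pages: 216,
    language: "English",
    publisher_id: "oreilly"
  },
  {
    _id: 234567890,
    title: "50 Tips and Tricks for MongoDB Developer",
    author: "Kristina Chodorow",
    published_date: new Date("2011-05-06"),
    pages: 68,
    language: "English",
    publisher_id: "oreilly"
  }
];

// Plain-JavaScript stand-in for db.books.find({ publisher_id: "oreilly" }).
const booksByPublisher = books.filter((b) => b.publisher_id === publisher._id);
console.log(booksByPublisher.map((b) => b.title));
```

With this layout, adding a new book never modifies the publisher document, so its size stays fixed.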
Given a customer collection that includes fields for gender and city, which aggregate pipeline shows the number of female customers in each city?
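A pipeline of this kind typically filters first and then groups. The sketch below assumes the `gender` field stores the string `"female"` (the question does not say how the value is encoded) and uses hypothetical sample customers; the loop is a plain-JavaScript equivalent of the two stages, included only to show the shape of the result.

```javascript
// Pipeline as it could be passed to db.customer.aggregate(pipeline) in mongosh.
// Assumption: the gender field stores the string "female".
const pipeline = [
  { $match: { gender: "female" } },
  { $group: { _id: "$city", count: { $sum: 1 } } }
];

// Hypothetical sample data, used only to illustrate the result.
const customers = [
  { name: "Ann", gender: "female", city: "Clayton" },
  { name: "Bob", gender: "male", city: "Clayton" },
  { name: "Cho", gender: "female", city: "Caulfield" },
  { name: "Dee", gender: "female", city: "Clayton" }
];

// Plain-JavaScript equivalent of the two stages above.
const counts = {};
for (const c of customers) {
  if (c.gender !== "female") continue;        // $match
  counts[c.city] = (counts[c.city] || 0) + 1; // $group with $sum: 1
}
console.log(counts);
```

Placing `$match` before `$group` means only the female customers reach the grouping stage.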
If the address data is frequently retrieved together with the name information, how would you modify the following schema, which represents a one-to-one relationship using references?
{
_id: "joe",
name: "Joe Bookreader"
}
{
patron_id: "joe",
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: "12345"
}
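When the two pieces of data are read together, the usual alternative to a reference is to embed the address inside the patron document. A minimal sketch of that embedded layout, written as a plain JavaScript object (the field name `address` is an assumption):

```javascript
// Patron document with the address embedded, so one read returns both.
// The separate address document and its patron_id reference are no longer needed.
const patron = {
  _id: "joe",
  name: "Joe Bookreader",
  address: {
    street: "123 Fake Street",
    city: "Faketon",
    state: "MA",
    zip: "12345"
  }
};

console.log(`${patron.name}: ${patron.address.street}, ${patron.address.city}`);
```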
Following is the structure of the restaurant collection:
{
"address": {
"building": "1007",
"coord": [ 144.946457, -37.840935 ],
"street": "Monash University Bakery",
"zipcode": "3800"
},
"suburb": "Clayton",
"cuisine": "Bakery",
"grades": [
{ "date": { "$date": 1393804800000 }, "grade": "A", "score": 2 },
{ "date": { "$date": 1378857600000 }, "grade": "A", "score": 6 },
{ "date": { "$date": 1358985600000 }, "grade": "A", "score": 10 },
{ "date": { "$date": 1322006400000 }, "grade": "A", "score": 9 },
{ "date": { "$date": 1299715200000 }, "grade": "B", "score": 14 }
],
"name": "Monash Bakery Shop",
"_id": "317455"
}
Write a MongoDB query to display the fields -- 'name', 'suburb' and 'zipcode', but exclude the field '_id' for all the documents in the collection.
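One possible approach uses the projection argument of `find`. Note that `zipcode` is nested under `address` in the sample document, so the sketch below uses the dotted path `"address.zipcode"`. The collection name `restaurant` is an assumption, and `applyProjection` is a hypothetical plain-JavaScript helper that only mimics what this particular projection returns.

```javascript
// mongosh form (the collection name `restaurant` is an assumption):
//   db.restaurant.find({}, { name: 1, suburb: 1, "address.zipcode": 1, _id: 0 })
const projection = { name: 1, suburb: 1, "address.zipcode": 1, _id: 0 };

// Sample document from the question, with the grades array shortened.
const restaurant = {
  address: {
    building: "1007",
    coord: [144.946457, -37.840935],
    street: "Monash University Bakery",
    zipcode: "3800"
  },
  suburb: "Clayton",
  cuisine: "Bakery",
  grades: [{ date: new Date(1393804800000), grade: "A", score: 2 }],
  name: "Monash Bakery Shop",
  _id: "317455"
};

// Hypothetical helper: plain-JavaScript stand-in for the projection above.
function applyProjection(doc) {
  return {
    address: { zipcode: doc.address.zipcode },
    suburb: doc.suburb,
    name: doc.name
  };
}

const projected = applyProjection(restaurant);
console.log(JSON.stringify(projected));
```

The `_id` field is the only one that has to be excluded explicitly, because it is returned by default.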
Following is the structure of the restaurant collection:
{
"address": {
"building": "1007",
"coord": [ 144.946457, -37.840935 ],
"street": "Monash University Bakery",
"zipcode": "3800"
},
"suburb": "Clayton",
"cuisine": "Bakery",
"grades": [
{ "date": { "$date": 1241804700000 }, "grade": "C", "score": 1 },
{ "date": { "$date": 1224857600000 }, "grade": "A", "score": 5 },
{ "date": { "$date": 1261985600000 }, "grade": "B", "score": 9 },
{ "date": { "$date": 1201003200000 }, "grade": "A", "score": 8 },
{ "date": { "$date": 1137705200000 }, "grade": "B", "score": 13 }
],
"name": "Monash Bakery Shop",
"restaurant_id": "317455"
}
Write a MongoDB query to display the first 5 restaurants that are in the suburb Clayton.
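A query like this usually combines an equality filter with `limit`. The sketch below assumes the collection is named `restaurant` and uses hypothetical generated data; the `filter` and `slice` calls are a plain-JavaScript stand-in for the query.

```javascript
// mongosh form (the collection name `restaurant` is an assumption):
//   db.restaurant.find({ suburb: "Clayton" }).limit(5)
const filter = { suburb: "Clayton" };
const limit = 5;

// Hypothetical data: 8 restaurants, 6 of them in Clayton.
const restaurants = Array.from({ length: 8 }, (_, i) => ({
  restaurant_id: String(317455 + i),
  name: `Restaurant ${i + 1}`,
  suburb: i % 4 === 3 ? "Caulfield" : "Clayton"
}));

// Plain-JavaScript stand-in for find(filter).limit(limit).
const firstFive = restaurants
  .filter((r) => r.suburb === filter.suburb)
  .slice(0, limit);

console.log(firstFive.map((r) => r.name));
```

Without an explicit `sort`, "first 5" means the first five matches in whatever order the server returns them.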
Which of the following operations cannot be directly implemented in MongoDB using standard operations?
MongoDB stores data in which format, both internally and over the network?
Which of the following cannot be used to reduce the query execution time in MongoDB?
You have 100 documents in the events collection, each containing an array attendees of 50 elements. 20 of those documents have at least one attendee with "VIP": true. Consider two approaches to count the total "VIP" attendees across all events:
Approach A: $unwind the attendees array, then $match "VIP": true.
Approach B: $match documents where at least one attendee is VIP, then $unwind attendees, then $match "VIP": true.
How many documents will be output by the $unwind stage in Approach B?
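The arithmetic can be checked with a small simulation. In Approach B the first `$match` keeps 20 documents, and `$unwind` emits one document per array element, so it outputs 20 × 50 = 1000 documents, compared with 100 × 50 = 5000 in Approach A. The sketch below reproduces this in plain JavaScript; the generated events, the attendee names, and the choice of exactly one VIP per matching event are assumptions (the question only says "at least one").

```javascript
// Build 100 events with 50 attendees each; the first 20 events get one VIP.
// Assumption: exactly one VIP per matching event.
const events = Array.from({ length: 100 }, (_, i) => ({
  _id: i,
  attendees: Array.from({ length: 50 }, (_, j) => ({
    name: `attendee_${i}_${j}`,
    VIP: i < 20 && j === 0
  }))
}));

// Stand-in for { $unwind: "$attendees" }: one output document per array element.
const unwind = (docs) =>
  docs.flatMap((d) => d.attendees.map((a) => ({ _id: d._id, attendees: a })));

// Approach A: unwind everything, then filter.
const unwoundA = unwind(events);

// Approach B: stand-in for { $match: { "attendees.VIP": true } }, then unwind.
const matchedB = events.filter((d) => d.attendees.some((a) => a.VIP === true));
const unwoundB = unwind(matchedB);

// Final $match on the unwound documents gives the same VIP count either way.
const vipA = unwoundA.filter((d) => d.attendees.VIP === true).length;
const vipB = unwoundB.filter((d) => d.attendees.VIP === true).length;

console.log(unwoundA.length, unwoundB.length, vipA, vipB); // 5000 1000 20 20
```

Both approaches return the same VIP count, but Approach B passes five times fewer documents through `$unwind`.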