J'essaie de générer une agrégation hiérarchique similaire à la suivante:

 ┌ Product Category 1
 │ ├ Category Aggregate Info
 │ └ Products
 │   ├ Product 1
 │   │ ├ Aggregate Prices
 │   │ ├ Aggregate Quantities Sold
 │   │ └ etc.
 │   └ Product 2
 │     ├ Aggregate Prices
 │     ├ Aggregate Quantities Sold
 │     └ etc.
 │
 └ Product Category 2
   ├ Category Aggregate Info
   └ Products   
     ├ Product 4
     │ ├ Aggregate Prices
     │ ├ Aggregate Quantities Sold
     │ └ etc.
     └ Product 7
       ├ Aggregate Prices
       ├ Aggregate Quantities Sold
       └ etc.

Je ne contrôle pas les catégories de produits (ex. Fruits vs Boissons) ou les produits (ex. Prunes vs Eau). Cela m'empêche de coder en dur ces valeurs telles qu'elles sont définies par les utilisateurs.

Je ne peux contrôler la hiérarchie des valeurs qu'en utilisant l'ordre cellguid.

Jusqu'à présent, j'ai pu générer une liste dont la structure est illustrée ci-dessous.

[
  {
    "record": [
      {
        "cellguid": "N118M2J4",
        "value": "Fruits"
      },
      {
        "cellguid": "V671H8W7",
        "value": "Mango"
      }
    ]
  },
  ... //truncated for brevity
]

Mon défi actuel consiste à convertir cela en agrégats imbriqués contenant les informations suivantes:

{
    "records": [{
            "cellguid": "N118M2J4",
            "value": "Fruits",
            "count": 5,
            "records": [
                {
                    "cellguid": "V671H8W7",
                    "value": "Mango",
                    "count": 2
                },
                {
                    "cellguid": "V671H8W7",
                    "value": "Orange",
                    "count": 1
                },
                {
                    "cellguid": "V671H8W7",
                    "value": "Plum",
                    "count": 1
                },
                {
                    "cellguid": "V671H8W7",
                    "value": "Apple",
                    "count": 1
                }
            ]
        },
        {
            "cellguid": "N118M2J4",
            "value": "Drinks",
            "count": 9,
            "records": [
                {
                    "cellguid": "V671H8W7",
                    "value": "Coca Cola",
                    "count": 3
                },
                {
                    "cellguid": "V671H8W7",
                    "value": "Pepsi",
                    "count": 3
                },
                {
                    "cellguid": "V671H8W7",
                    "value": "Water",
                    "count": 3
                }
            ]
        }
    ]
}

J'utilise MongoDB 4.4.0 sans beaucoup de chance pour obtenir les groupes imbriqués. Toute aide serait très appréciée.

La liste complète est ici:

[
  {
    "record": [
      {
        "cellguid": "N118M2J4",
        "value": "Drinks"
      },
      {
        "cellguid": "V671H8W7",
        "value": "Coca Cola"
      }
    ]
  },
  {
    "record": [
      {
        "cellguid": "N118M2J4",
        "value": "Fruits"
      },
      {
        "cellguid": "V671H8W7",
        "value": "Mango"
      }
    ]
  },
  {
    "record": [
      {
        "cellguid": "N118M2J4",
        "value": "Drinks"
      },
      {
        "cellguid": "V671H8W7",
        "value": "Coca Cola"
      }
    ]
  },
  {
    "record": [
      {
        "cellguid": "N118M2J4",
        "value": "Drinks"
      },
      {
        "cellguid": "V671H8W7",
        "value": "Coca Cola"
      }
    ]
  },
  {
    "record": [
      {
        "cellguid": "N118M2J4",
        "value": "Fruits"
      },
      {
        "cellguid": "V671H8W7",
        "value": "Orange"
      }
    ]
  },
  {
    "record": [
      {
        "cellguid": "N118M2J4",
        "value": "Drinks"
      },
      {
        "cellguid": "V671H8W7",
        "value": "Pepsi"
      }
    ]
  },
  {
    "record": [
      {
        "cellguid": "N118M2J4",
        "value": "Drinks"
      },
      {
        "cellguid": "V671H8W7",
        "value": "Pepsi"
      }
    ]
  },
  {
    "record": [
      {
        "cellguid": "N118M2J4",
        "value": "Drinks"
      },
      {
        "cellguid": "V671H8W7",
        "value": "Pepsi"
      }
    ]
  },
  {
    "record": [
      {
        "cellguid": "N118M2J4",
        "value": "Fruits"
      },
      {
        "cellguid": "V671H8W7",
        "value": "Plum"
      }
    ]
  },
  {
    "record": [
      {
        "cellguid": "N118M2J4",
        "value": "Fruits"
      },
      {
        "cellguid": "V671H8W7",
        "value": "Apple"
      }
    ]
  },
  {
    "record": [
      {
        "cellguid": "N118M2J4",
        "value": "Drinks"
      },
      {
        "cellguid": "V671H8W7",
        "value": "Water"
      }
    ]
  },
  {
    "record": [
      {
        "cellguid": "N118M2J4",
        "value": "Drinks"
      },
      {
        "cellguid": "V671H8W7",
        "value": "Water"
      }
    ]
  },
  {
    "record": [
      {
        "cellguid": "N118M2J4",
        "value": "Drinks"
      },
      {
        "cellguid": "V671H8W7",
        "value": "Water"
      }
    ]
  },
  {
    "record": [
      {
        "cellguid": "N118M2J4",
        "value": "Fruits"
      },
      {
        "cellguid": "V671H8W7",
        "value": "Mango"
      }
    ]
  }
]
1
haroldcampbell 21 févr. 2021 à 14:50

2 réponses

Meilleure réponse

Une approche peut consister à regrouper par categ / itemId, puis par categId

const groupByItemCateg = {
  $group: {
    _id: '$record',
    categ: {
      $first: {
        $first: '$record'
      }
    },
    item: {
      $first: {
        $last: '$record'
      }
    },
    n: { $sum: 1 }
  }
}

const groupByCateg = {
  $group: {
    _id: '$categ',
    cellguid: {
      $first: '$categ.cellguid'
    },
    value: {
      $first: '$categ.value'
    },
    n: { $sum: '$n' },
    records: {
      $push: {
        cellguid: '$item.cellguid',
        value: '$item.value',
        n: '$n'
      }
    }
  }
}

const project = {
  $project: { _id: 0 }
}
printjson(db.products.aggregate([
  groupByItemCateg,
  groupByCateg,
  project
]).toArray())

terrain de jeux


En fait nécessaire pour la généralisation de la hiérarchie imbriquée en k:

  1. Nous pouvons considérer la donnée comme {cellguid, value, records, n}.

Ensuite, nous pouvons faire suivre le document initial (doc:$$ROOT) le long des différentes étapes pour choisir la donnée correspondante en profondeur k.

  1. Je sérialise le champ _id sous forme de chaîne mais tout est bon tant que ce n'est pas un tableau

Le nom de champ _id est réservé pour une utilisation comme clé primaire; sa valeur doit être unique dans la collection, est immuable et peut être de tout type autre qu'un tableau.

(J'ai en fait essayé avec _id: $slice: { ['$doc.record' 0, depth+1] } mais cela donne des résultats incohérents sur local / mongoplayground même si 4.4.3 sur les deux parties)


db.products.remove({})
data=[{"record":[{"cellguid":"N118M2J4","value":"Drinks"},{"cellguid":"V671H8W7","value":"Coca Cola"},{"cellguid":"savour","value":"light"}]},{"record":[{"cellguid":"N118M2J4","value":"Fruits"},{"cellguid":"V671H8W7","value":"Mango"},{"cellguid":"color","value":"yellow"}]},{"record":[{"cellguid":"N118M2J4","value":"Drinks"},{"cellguid":"V671H8W7","value":"Coca Cola"},{"cellguid":"savour","value":"lemon"}]},{"record":[{"cellguid":"N118M2J4","value":"Drinks"},{"cellguid":"V671H8W7","value":"Coca Cola"},{"cellguid":"savour","value":"light"}]},{"record":[{"cellguid":"N118M2J4","value":"Fruits"},{"cellguid":"V671H8W7","value":"Orange"},{"cellguid":"color","value":"orange"}]},{"record":[{"cellguid":"N118M2J4","value":"Drinks"},{"cellguid":"V671H8W7","value":"Pepsi"},{"cellguid":"savour","value":"light"}]},{"record":[{"cellguid":"N118M2J4","value":"Drinks"},{"cellguid":"V671H8W7","value":"Pepsi"},{"cellguid":"savour","value":"MAX"}]},{"record":[{"cellguid":"N118M2J4","value":"Drinks"},{"cellguid":"V671H8W7","value":"Pepsi"},{"cellguid":"savour","value":"crystal"}]},{"record":[{"cellguid":"N118M2J4","value":"Fruits"},{"cellguid":"V671H8W7","value":"Plum"},{"cellguid":"color","value":"yellow"}]},{"record":[{"cellguid":"N118M2J4","value":"Fruits"},{"cellguid":"V671H8W7","value":"Apple"},{"cellguid":"color","value":"golden"}]},{"record":[{"cellguid":"N118M2J4","value":"Drinks"},{"cellguid":"V671H8W7","value":"Water"},{"cellguid":"savour","value":"beer"}]},{"record":[{"cellguid":"N118M2J4","value":"Drinks"},{"cellguid":"V671H8W7","value":"Water"},{"cellguid":"savour","value":"ale"}]},{"record":[{"cellguid":"N118M2J4","value":"Drinks"},{"cellguid":"V671H8W7","value":"Water"},{"cellguid":"savour","value":"ale"}]},{"record":[{"cellguid":"N118M2J4","value":"Fruits"},{"cellguid":"V671H8W7","value":"Mango"},{"cellguid":"color","value":"yellow"}]}]
db.products.insert(data)

const id = (depth, arrField='$doc.record') => ({
  $reduce: {
    input: { $slice: [arrField, 0, depth+1] },
    initialValue: '',
    in: {
      $concat: ['$$value', '$$this.cellguid', '$$this.value']
    }
  }
})
const initialize = depth => ({
  $group: {
    _id: id(depth, '$record'),
    datum: {
      $first: {
        $last: '$record'
      }
    },
    n: { $sum: 1 },
    doc: {
      $first: '$$ROOT'
    }
  }
})

const groupByItemCateg = depth => ({
  $group: {
    _id: id(depth),
    datum: {
      $first: {
        $arrayElemAt: ['$doc.record', depth]
      }
    },
    n: { $sum: '$n' },
    records: {
      $push: {
        cellguid: '$datum.cellguid',
        value: '$datum.value',
        records: '$records',
        n: '$n'
      }
    },

    doc: { $first: '$doc' }
  }
})

const project = {
  $project: {
    cellguid:'$datum.cellguid',
    value: '$datum.value',
    records: 1,
    n: 1
  }
}
const stages = [
  initialize(2),
  // here we are 3-level nested
  // max array idx is 2, so we have group by on the 0th idx and 1st idx
  // should we be 4-level nested, we would need to prepend/unshift groupByItemCateg(2)
  groupByItemCateg(1),
  groupByItemCateg(0), // the top level group by
  project
]
printjson(db.products.aggregate(stages).toArray())

aire de jeux pour 3 profondeurs

2
grodzi 21 févr. 2021 à 19:05

Essayez cette requête:

db.products.aggregate([
    {
        $addFields: {
            "product_category": { $arrayElemAt: ["$record", 0] }
        }
    },
    {
        $addFields: {
            "product": { $slice: ["$record", 1, { $size: "$record" }] }
        }
    },
    { $unwind: "$product" },
    {
        $group: {
            _id: "$product",
            count: { $sum: 1 },
            product_category: { $first: "$product_category" }
        }
    },
    {
        $group: {
            _id: "$product_category",
            count: { $sum: "$count" },
            records: {
                $push: {
                    "cellguid": "$_id.cellguid",
                    "value": "$_id.value",
                    "count": "$count"
                }
            }
        }
    },
    {
        $project: {
            "_id": 0,
            "cellguid": "$_id.cellguid",
            "value": "$_id.value",
            "count": "$count",
            "records": "$records"
        }
    }
]);

Production:

/* 1 */
{
    "_id" : {
        "cellguid" : "N118M2J4",
        "value" : "Fruits"
    },
    "count" : 5,
    "records" : [
        {
            "cellguid" : "V671H8W7",
            "value" : "Mango",
            "count" : 2
        },
        {
            "cellguid" : "V671H8W7",
            "value" : "Plum",
            "count" : 1
        },
        {
            "cellguid" : "V671H8W7",
            "value" : "Apple",
            "count" : 1
        },
        {
            "cellguid" : "V671H8W7",
            "value" : "Orange",
            "count" : 1
        }
    ]
},

/* 2 */
{
    "_id" : {
        "cellguid" : "N118M2J4",
        "value" : "Drinks"
    },
    "count" : 9,
    "records" : [
        {
            "cellguid" : "V671H8W7",
            "value" : "Pepsi",
            "count" : 3
        },
        {
            "cellguid" : "V671H8W7",
            "value" : "Water",
            "count" : 3
        },
        {
            "cellguid" : "V671H8W7",
            "value" : "Coca Cola",
            "count" : 3
        }
    ]
}
0
Dheemanth Bhat 21 févr. 2021 à 16:15