$substrBytes - Amazon DocumentDB
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

$substrBytes

The $substrBytes operator in Amazon DocumentDB is used to extract a substring from a string based on a specified byte range. This operator is useful when you need to extract a substring from a string and the number of bytes required to represent each character in the string is important.

Unlike $substrCP, which operates on the number of Unicode code points, $substrBytes operates on the number of bytes required to represent the characters in the string. This can be particularly useful when working with strings that contain non-ASCII characters, as these characters may require more than one byte to represent.

*Note:* $substr has been deprecated since version 3.4. $substr is now an alias for $substrBytes.

Parameters

  • string: The input string from which the substring will be extracted.

  • startByte: The zero-based starting byte position of the substring to be extracted. A negative value can be used to specify a position from the end of the string.

  • length: The number of bytes in the substring to be extracted.

Example (MongoDB Shell)

In this example, we'll use $substrBytes to extract a substring from a string that contains non-ASCII characters.

Create sample documents

db.people.insertMany([ { "_id": 1, "Desk": "Düsseldorf-NRW-021" }, { "_id": 2, "Desk": "Bremerhaven-HBB-32a" }, { "_id": 3, "Desk": "Norderstedt-SHH-892.50" }, { "_id": 4, "Desk": "Brandenburg-BBB-78" } ]);

Query example

db.people.aggregate([ { $project: { "state": { $substrBytes: [ "$Desk", 12, 3] } } } ])

Output

{ "_id": 1, "state": "NRW" }, { "_id": 2, "state": "HBB" }, { "_id": 3, "state": "SHH" }, { "_id": 4, "state": "BBB" }

In this example, we use $substrBytes to extract a 3-byte substring starting from the 12th byte of the Desk field. This allows us to extract the 2-character state abbreviation, even though the string may contain non-ASCII characters.

Code examples

To view a code example for using the $substrBytes command, choose the tab for the language that you want to use:

Node.js
const { MongoClient } = require('mongodb'); async function example() { const client = await MongoClient.connect('mongodb://<username>:<password>@<cluster-endpoint>:27017/?tls=true&tlsCAFile=global-bundle.pem&replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false'); const db = client.db('test'); const people = db.collection('people'); const result = await people.aggregate([ { $project: { "state": { $substrBytes: ["$Desk", 12, 3] } } } ]).toArray(); console.log(result); client.close(); } example();
Python
from pymongo import MongoClient def example(): client = MongoClient('mongodb://<username>:<password>@<cluster-endpoint>:27017/?tls=true&tlsCAFile=global-bundle.pem&replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false') db = client.test people = db.people result = list(people.aggregate([ { '$project': { "state": { '$substrBytes': ["$Desk", 12, 3] } } } ])) print(result) client.close() example()