From c7d2abdd9872225426192126b60cdadded91f283 Mon Sep 17 00:00:00 2001 From: Mike Woofter Date: Fri, 26 Apr 2024 11:59:34 -0500 Subject: [PATCH 1/7] autobuilder From cc2db829381b6b759c80aa25bbe762906a65a9d1 Mon Sep 17 00:00:00 2001 From: Mike Woofter <108414937+mongoKart@users.noreply.github.com> Date: Fri, 7 Feb 2025 16:48:41 -0600 Subject: [PATCH 2/7] wip --- config/redirects | 1 + source/connect/mongoclient.txt | 59 +++++- source/faq.txt | 250 +---------------------- source/includes/write/unique-id-note.rst | 12 ++ source/tools.txt | 8 + source/write/bulk-write.txt | 6 +- source/write/insert.txt | 32 +-- 7 files changed, 78 insertions(+), 290 deletions(-) create mode 100644 source/includes/write/unique-id-note.rst diff --git a/config/redirects b/config/redirects index 24adc3fe..af8154f7 100644 --- a/config/redirects +++ b/config/redirects @@ -12,3 +12,4 @@ raw: ${prefix}/master -> ${base}/upcoming/ raw: ${prefix}/get-started/download-and-install/ -> ${base}/current/get-started/download-and-install/ [*-master]: ${prefix}/${version}/security/enterprise-authentication/ -> ${base}/${version}/security/authentication/ +[*-master]: ${prefix}/${version}/faq/ -> ${base}/${version}/ diff --git a/source/connect/mongoclient.txt b/source/connect/mongoclient.txt index b303329c..51c2dc51 100644 --- a/source/connect/mongoclient.txt +++ b/source/connect/mongoclient.txt @@ -168,13 +168,60 @@ constructor accepts. All parameters are optional. **Data type:** `TypeRegistry <{+api-root+}bson/codec_options.html#bson.codec_options.TypeRegistry>`__ -.. tip:: Reusing Your Client +Concurrent Execution +-------------------- - Because each ``MongoClient`` object represents a pool of connections to the - database, most applications require only a single instance of - ``MongoClient``, even across multiple requests. However, if you fork - a process, the child process *does* need its own ``MongoClient`` object. - To learn more, see the :ref:`FAQ ` page. 
+The following sections describe {+driver-short+}'s support for concurrent execution +mechanisms. + +Multithreading +~~~~~~~~~~~~~~ + +{+driver-short+} is thread-safe and provides built-in connection pooling +for threaded applications. +Because each ``MongoClient`` object represents a pool of connections to the +database, most applications require only a single instance of +``MongoClient``, even across multiple requests. + +.. _pymongo-forks: + +Multiple Forks +~~~~~~~~~~~~~~~ + +{+driver-short+} supports using the ``fork()`` method to create a new process. +However, if you fork a process, you must create a new ``MongoClient`` instance in the +child process. + +.. important:: Don't Pass a MongoClient to a Child Process + + If you use the ``fork()`` method to create a new process, don't pass an instance + of the ``MongoClient`` class from the parent process to the child process. This creates + a high probability of deadlock among ``MongoClient`` instances in the child process. + {+driver-short+} tries to issue a warning if this deadlock might occur. + +Multiprocessing +~~~~~~~~~~~~~~~ + +{+driver-short+} supports the Python ``multiprocessing`` module. +However, on Unix systems, the multiprocessing module spawns processes by using +the ``fork()`` method. This carries the same risks described in :ref:`` + +To use multiprocessing with {+driver-short+}, write code similar to the following example: + +.. code-block:: python + + # Each process creates its own instance of MongoClient. + def func(): + db = pymongo.MongoClient().mydb + # Do something with db. + + proc = multiprocessing.Process(target=func) + proc.start() + +.. important:: + + Do not copy an instance of the ``MongoClient`` class from the parent process to a child + process. Type Hints ---------- diff --git a/source/faq.txt b/source/faq.txt index 49edb8ae..7a8cc5a6 100644 --- a/source/faq.txt +++ b/source/faq.txt @@ -1,63 +1,4 @@ -.. _pymongo-faq: - -Frequently Asked Questions -========================== - -.. 
contents:: On this page - :local: - :backlinks: none - :depth: 1 - :class: singlecol - -.. facet:: - :name: genre - :values: reference - -.. meta:: - :keywords: errors, problems, help, troubleshoot - -Is {+driver-short+} Thread-Safe? ------------------------ - -Yes. {+driver-short+} is thread-safe and provides built-in connection pooling -for threaded applications. - -.. _pymongo-fork-safe: - -Is {+driver-short+} Fork-Safe? ---------------------- - -No. If you use the ``fork()`` method to create a new process, don't pass an instance -of the ``MongoClient`` class from the parent process to the child process. This creates -a high probability of deadlock among ``MongoClient`` instances in the child process. -Instead, create a new ``MongoClient`` instance in the child process. - -.. note:: - - {+driver-short+} tries to issue a warning if this deadlock might occur. - -Can I Use {+driver-short+} with Multiprocessing? ---------------------------------------- - -Yes. However, on Unix systems, the multiprocessing module spawns processes by using -the ``fork()`` method. This carries the same risks described in :ref:`` - -To use multiprocessing with {+driver-short+}, write code similar to the following example: - -.. code-block:: python - - # Each process creates its own instance of MongoClient. - def func(): - db = pymongo.MongoClient().mydb - # Do something with db. - - proc = multiprocessing.Process(target=func) - proc.start() - -.. important:: - - Do not copy an instance of the ``MongoClient`` class from the parent process to a child - process. +.. docs-landing/source/languages/python.txt Can {+driver-short+} Load the Results of a Query as a Pandas DataFrame? ----------------------------------------------------------------------- @@ -69,194 +10,6 @@ load MongoDB query result-sets as `NumPy ndarrays `__, or `Apache Arrow Tables `__. -How Does Connection Pooling Work in {+driver-short+}? 
--------------------------------------------- - -Every ``MongoClient`` instance has a built-in connection pool for each server -in your MongoDB topology. Connection pools open sockets on demand to -support concurrent requests to MongoDB in your application. - -The maximum size of each connection pool is set by the ``maxPoolSize`` option, which -defaults to ``100``. If the number of in-use connections to a server reaches -the value of ``maxPoolSize``, the next request to that server will wait -until a connection becomes available. - -In addition to the sockets needed to support your application's requests, -each ``MongoClient`` instance opens two more sockets per server -in your MongoDB topology for monitoring the server's state. -For example, a client connected to a three-node replica set opens six -monitoring sockets. If the application uses the default setting for -``maxPoolSize`` and only queries the primary (default) node, then -there can be at most ``106`` total connections in the connection pool. If the -application uses a :ref:`read preference ` to query the -secondary nodes, those connection pools grow and there can be -``306`` total connections. - -To support high numbers of concurrent MongoDB requests -within one process, you can increase ``maxPoolSize``. - -Connection pools are rate-limited. The ``maxConnecting`` option -determines the number of connections that the pool can create in -parallel at any time. For example, if the value of ``maxConnecting`` is -``2``, the third request that attempts to concurrently check out a -connection succeeds only when one the following cases occurs: - -- The connection pool finishes creating a connection and there are fewer - than ``maxPoolSize`` connections in the pool. -- An existing connection is checked back into the pool. -- The driver's ability to reuse existing connections improves due to - rate-limits on connection creation. 
- -You can set the minimum number of concurrent connections to -each server with the ``minPoolSize`` option, which defaults to ``0``. -The driver initializes the connection pool with this number of sockets. If -sockets are closed, causing the total number -of sockets (both in use and idle) to drop below the minimum, more -sockets are opened until the minimum is reached. - -You can set the maximum number of milliseconds that a connection can -remain idle in the pool by setting the ``maxIdleTimeMS`` option. -Once a connection has been idle for ``maxIdleTimeMS``, the connection -pool removes and replaces it. This option defaults to ``0`` (no limit). - -The following default configuration for a ``MongoClient`` works for most -applications: - -.. code-block:: python - - client = MongoClient(host, port) - -``MongoClient`` supports multiple concurrent requests. For each process, -create a client and reuse it for all operations in a process. This -practice is more efficient than creating a client for each request. - -The driver does not limit the number of requests that -can wait for sockets to become available, and it is the application's -responsibility to limit the size of its pool to bound queuing -during a load spike. Requests wait for the amount of time specified in -the ``waitQueueTimeoutMS`` option, which defaults to ``0`` (no limit). - -A request that waits more than the length of time defined by -``waitQueueTimeoutMS`` for a socket raises a ``ConnectionFailure`` error. Use this -option if it is more important to bound the duration of operations -during a load spike than it is to complete every operation. - -When ``MongoClient.close()`` is called by any request, the driver -closes all idle sockets and closes all sockets that are in -use as they are returned to the pool. Calling ``MongoClient.close()`` -closes only inactive sockets, so you cannot interrupt or terminate -any ongoing operations by using this method. 
The driver closes these -sockets only when the process completes. - -For more information, see the :manual:`Connection Pool Overview ` -in the {+mdb-server+} documentation. - -Why Does {+driver-short+} Add an _id Field to All My Documents? ------------------------------------------------------- - -When you use the ``Collection.insert_one()`` method, -``Collection.insert_many()`` method, or -``Collection.bulk_write()`` method to insert a document into MongoDB, -and that document does not -include an ``_id`` field, {+driver-short+} automatically adds this field for you. -It also sets the value of the field to an instance of ``ObjectId``. - -The following code example inserts a document without an ``_id`` field into MongoDB, then -prints the document. After it's inserted, the document contains an ``_id`` field whose -value is an instance of ``ObjectId``. - -.. code-block:: python - - >>> my_doc = {'x': 1} - >>> collection.insert_one(my_doc) - InsertOneResult(ObjectId('560db337fba522189f171720'), acknowledged=True) - >>> my_doc - {'x': 1, '_id': ObjectId('560db337fba522189f171720')} - -{+driver-short+} adds an ``_id`` field in this manner for a few reasons: - -- All MongoDB documents must have an ``_id`` field. -- If {+driver-short+} inserts a document without an ``_id`` field, MongoDB adds one - itself, but doesn't report the value back to {+driver-short+} for your application - to use. -- Copying the document before adding the ``_id`` field is - prohibitively expensive for most high-write-volume applications. - -.. tip:: - - If you don't want {+driver-short+} to add an ``_id`` to your documents, insert only - documents that your application has already added an ``_id`` field to. - -How Do I Change the Timeout Value for Cursors? ----------------------------------------------- - -MongoDB doesn't support custom timeouts for cursors, but you can turn off cursor -timeouts. To do so, pass the ``no_cursor_timeout=True`` option to -the ``find()`` method. 
- -How Can I Store ``Decimal`` Instances? --------------------------------------- - -MongoDB v3.4 introduced the ``Decimal128`` BSON type, a 128-bit decimal-based -floating-point value capable of emulating decimal rounding with exact precision. -{+driver-short+} versions 3.4 and later also support this type. -Earlier MongoDB versions, however, support only IEEE 754 floating points, equivalent to the -Python ``float`` type. {+driver-short+} can store ``Decimal`` instances to -these versions of MongoDB only by converting them to the ``float`` type. -You must perform this conversion explicitly. - -For more information, see the {+driver-short+} API documentation for -`decimal128. `__ - -Why Does {+driver-short+} Convert ``9.99`` to ``9.9900000000000002``? ---------------------------------------------------------------------- - -MongoDB represents ``9.99`` as an IEEE floating-point value, which can't -represent the value precisely. This is also true in some versions of -Python. In this regard, {+driver-short+} behaves the same way as -the JavaScript shell, all other MongoDB drivers, and the Python language itself. - -Does {+driver-short+} Support Attribute-style Access for Documents? ----------------------------------------------------------- - -No. {+driver-short+} doesn't implement this feature, for the following reasons: - -1. Adding attributes pollutes the attribute namespace for documents and could - lead to subtle bugs or confusing errors when using a key with the - same name as a dictionary method. - -#. {+driver-short+} uses SON objects instead of regular - dictionaries only to maintain key ordering, because the server - requires this for certain operations. Adding this feature would - complicate the ``SON`` class and could break backwards compatibility - if {+driver-short+} ever reverts to using dictionaries. - -#. Documents behave just like dictionaries, which makes them relatively simple - for new {+driver-short+} users to understand. 
Changing the behavior of documents - adds a barrier to entry for these users. - -For more information, see the relevant -`Jira case. `__ - -Does {+driver-short+} Support Asynchronous Frameworks? ---------------------------------------------- - -Yes. For more information, see the :ref:`` guide. - -Does {+driver-short+} Work with mod_wsgi? --------------------------------- - -Yes. See :ref:`pymongo-mod_wsgi` in the Tools guide. - -Does {+driver-short+} Work with PythonAnywhere? --------------------------------------- - -No. {+driver-short+} creates Python threads, which -`PythonAnywhere `__ does not support. - -For more information, see -the relevant `Jira ticket. `__ - How Can I Encode My Documents to JSON? -------------------------------------- @@ -278,7 +31,6 @@ depend on {+driver-short+} and might offer a performance improvement over python-bsonjs works best with {+driver-short+} when using the ``RawBSONDocument`` type. - Does {+driver-short+} Behave Differently in Python 3? ----------------------------------------------------- diff --git a/source/includes/write/unique-id-note.rst b/source/includes/write/unique-id-note.rst new file mode 100644 index 00000000..d9238070 --- /dev/null +++ b/source/includes/write/unique-id-note.rst @@ -0,0 +1,12 @@ +.. note:: _id Field Must Be Unique + + In a MongoDB collection, each document must contain an ``_id`` field + with a unique value. + + If you specify a value for the ``_id`` field, you must ensure that the + value is unique across the collection. If you don't specify a value, + the driver automatically generates a unique ``ObjectId`` value for the field. + + We recommend letting the driver automatically generate ``_id`` values to + ensure uniqueness. Duplicate ``_id`` values violate unique index constraints, which + causes the driver to return an error. 
\ No newline at end of file diff --git a/source/tools.txt b/source/tools.txt index 7eee7638..bf70301a 100644 --- a/source/tools.txt +++ b/source/tools.txt @@ -274,3 +274,11 @@ This section lists alternatives to {+driver-short+}. - `MongoMock `__ is a small library to help test Python code. It uses {+driver-short+} to interact with MongoDB. +.. note:: {+driver-short+} is Incompatible with PythonAnywhere + + {+driver-short+} creates Python threads, which + `PythonAnywhere `__ does not support. + + For more information, see + the relevant `Jira ticket. `__ + diff --git a/source/write/bulk-write.txt b/source/write/bulk-write.txt index 3a8d5b81..7978b4e0 100644 --- a/source/write/bulk-write.txt +++ b/source/write/bulk-write.txt @@ -67,11 +67,7 @@ The following example creates an instance of ``InsertOne``: To insert multiple documents, create an instance of ``InsertOne`` for each document. -.. note:: - - Duplicate ``_id`` values violate unique index constraints, which causes the - driver to return a ``DuplicateKeyError``. To avoid this error, ensure that - each document you insert has a unique ``_id`` value. +.. include:: /includes/write/unique-id-note.rst Update Operations ~~~~~~~~~~~~~~~~~ diff --git a/source/write/insert.txt b/source/write/insert.txt index 738604d5..19ed2b42 100644 --- a/source/write/insert.txt +++ b/source/write/insert.txt @@ -27,6 +27,8 @@ An insert operation inserts one or more documents into a MongoDB collection. You can perform an insert operation by using the ``insert_one()`` or ``insert_many()`` method. +.. include:: /includes/write/unique-id-note.rst + .. .. tip:: Interactive Lab .. This page includes a short interactive lab that demonstrates how to @@ -45,36 +47,6 @@ from the :atlas:`Atlas sample datasets `. To learn how to create a free MongoDB Atlas cluster and load the sample datasets, see the :ref:`` tutorial. 
-The ``_id`` Field ------------------ - -In a MongoDB collection, each document *must* contain an ``_id`` field -with a unique field value. - -MongoDB allows you to manage this field in two ways: - -- You can set this field for each document yourself, ensuring each - ``_id`` field value is unique. -- You can let the driver automatically generate unique ``ObjectId`` - values for each document ``_id``. If you do not manually set an - ``_id`` value for a document, the driver populates the field - with an ``ObjectId``. - -Unless you can guarantee uniqueness, we recommend -letting the driver automatically generate ``_id`` values. - -.. note:: - - Duplicate ``_id`` values violate unique index constraints, which - causes the driver to return a ``WriteError`` from - ``insert_one()`` or a ``BulkWriteError`` from ``insert_many()``. - -To learn more about the ``_id`` field, see the -:manual:`Unique Indexes ` guide in the {+mdb-server+} manual. - -To learn more about document structure and rules, see the -:manual:`Documents ` guide in the {+mdb-server+} manual. - Insert One Document ------------------- From 8f9b0dcbff2031b74a29add0b791ed72585405c5 Mon Sep 17 00:00:00 2001 From: Mike Woofter <108414937+mongoKart@users.noreply.github.com> Date: Fri, 14 Feb 2025 16:06:10 -0600 Subject: [PATCH 3/7] first draft --- source/data-formats/extended-json.txt | 73 ++++++++- source/faq.txt | 148 ------------------ .../language-compatibility-table-pymongo.rst | 136 +++++++++++++++- source/read/retrieve.txt | 11 +- source/serialization.txt | 42 +++++ 5 files changed, 258 insertions(+), 152 deletions(-) diff --git a/source/data-formats/extended-json.txt b/source/data-formats/extended-json.txt index 7272b6f1..635d5cb5 100644 --- a/source/data-formats/extended-json.txt +++ b/source/data-formats/extended-json.txt @@ -178,6 +178,57 @@ list of dictionaries by using the ``loads()`` method: {'bin': Binary(b'\x01\x02\x03\x04', 128)} ] +.. 
_pymongo-extended-json-binary-values: + +Reading Binary Values in Python 2 +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In Python 3, the driver decodes JSON binary values with subtype 0 to instances of the +``bytes`` class. In Python 2, the driver decodes these values to instances of the ``Binary`` +class with subtype 0. + +The following code examples show how {+driver-short+} decodes JSON binary instances with +subtype 0. Select the :guilabel:`Python 2` or :guilabel:`Python 3` tab to view the +corresponding code. + +.. tabs:: + + .. tab:: Python 2 + :tabid: python2 + + .. io-code-block:: + :copyable: true + + .. input:: + :language: python + + from bson.json_util import loads + + doc = loads('{"b": {"$binary": "dGhpcyBpcyBhIGJ5dGUgc3RyaW5n", "$type": "00"}}') + print(doc) + + .. output:: + + {u'b': Binary('this is a byte string', 0)} + + .. tab:: Python 3 + :tabid: python3 + + .. io-code-block:: + :copyable: true + + .. input:: + :language: python + + from bson.json_util import loads + + doc = loads('{"b": {"$binary": "dGhpcyBpcyBhIGJ5dGUgc3RyaW5n", "$type": "00"}}') + print(doc) + + .. output:: + + {'b': b'this is a byte string'} + Write Extended JSON ------------------- @@ -273,10 +324,30 @@ The following example shows how to output Extended JSON in the Canonical format: Additional Information ---------------------- +The resources in the following sections provide more information about working +with Extended JSON. 
+ +API Documentation +~~~~~~~~~~~~~~~~~ + For more information about the methods and types in ``bson.json_util``, see the following API documentation: - `loads() <{+api-root+}bson/json_util.html#bson.json_util.loads>`__ - `dumps() <{+api-root+}bson/json_util.html#bson.json_util.dumps>`__ - `CANONICAL_JSON_OPTIONS <{+api-root+}bson/json_util.html#bson.json_util.CANONICAL_JSON_OPTIONS>`__ -- `LEGACY_JSON_OPTIONS <{+api-root+}bson/json_util.html#bson.json_util.LEGACY_JSON_OPTIONS>`__ \ No newline at end of file +- `LEGACY_JSON_OPTIONS <{+api-root+}bson/json_util.html#bson.json_util.LEGACY_JSON_OPTIONS>`__ + +Other Packages +~~~~~~~~~~~~~~ + +`python-bsonjs `__ is another package, +built on top of `libbson `__, +that can convert BSON to Extended JSON. The ``python-bsonjs`` package doesn't +depend on {+driver-short+} and might offer a performance improvement over +``json_util`` in certain cases. + +.. tip:: Use the RawBSONDocument Type + + ``python-bsonjs`` works best with {+driver-short+} when converting from the + ``RawBSONDocument`` type. \ No newline at end of file diff --git a/source/faq.txt b/source/faq.txt index 7a8cc5a6..e69de29b 100644 --- a/source/faq.txt +++ b/source/faq.txt @@ -1,148 +0,0 @@ -.. docs-landing/source/languages/python.txt - -Can {+driver-short+} Load the Results of a Query as a Pandas DataFrame? ------------------------------------------------------------------------ - -You can use the `PyMongoArrow `__ -library to work with numerical or columnar data. PyMongoArrow lets you -load MongoDB query result-sets as -`Pandas DataFrames `__, -`NumPy ndarrays `__, or -`Apache Arrow Tables `__. - -How Can I Encode My Documents to JSON? --------------------------------------- - -{+driver-short+} supports some special types, like ``ObjectId`` -and ``DBRef``, that aren't supported in JSON. Therefore, Python's ``json`` module won't -work with all documents in {+driver-short+}. 
Instead, {+driver-short+} includes the -`json_util `__ -module, a tool for using Python's ``json`` module with BSON documents and -`MongoDB Extended JSON `__. - -`python-bsonjs `__ is another -BSON-to-MongoDB-Extended-JSON converter, built on top of -`libbson `__. python-bsonjs doesn't -depend on {+driver-short+} and might offer a performance improvement over -``json_util`` in certain cases. - -.. tip:: - - python-bsonjs works best with {+driver-short+} when using the ``RawBSONDocument`` - type. - -Does {+driver-short+} Behave Differently in Python 3? ------------------------------------------------------ - -{+driver-short+} encodes instances of the ``bytes`` class -as BSON type 5 (binary data) with subtype 0. -In Python 2, these instances are decoded to ``Binary`` -with subtype 0. In Python 3, they are decoded back to ``bytes``. - -The following code examples use {+driver-short+} to insert a ``bytes`` instance -into MongoDB, and then find the instance. -In Python 2, the byte string is decoded to ``Binary``. -In Python 3, the byte string is decoded back to ``bytes``. - -.. tabs:: - - .. tab:: Python 2.7 - :tabid: python-2 - - .. code-block:: python - - >>> import pymongo - >>> c = pymongo.MongoClient() - >>> c.test.bintest.insert_one({'binary': b'this is a byte string'}).inserted_id - ObjectId('4f9086b1fba5222021000000') - >>> c.test.bintest.find_one() - {u'binary': Binary('this is a byte string', 0), u'_id': ObjectId('4f9086b1fba5222021000000')} - - .. tab:: Python 3.7 - :tabid: python-3 - - .. code-block:: python - - >>> import pymongo - >>> c = pymongo.MongoClient() - >>> c.test.bintest.insert_one({'binary': b'this is a byte string'}).inserted_id - ObjectId('4f9086b1fba5222021000000') - >>> c.test.bintest.find_one() - {'binary': b'this is a byte string', '_id': ObjectId('4f9086b1fba5222021000000')} - -Similarly, Python 2 and 3 behave differently when {+driver-short+} parses JSON binary -values with subtype 0. 
In Python 2, these values are decoded to instances of ``Binary`` -with subtype 0. In Python 3, they're decoded into instances of ``bytes``. - -The following code examples use the ``json_util`` module to decode a JSON binary value -with subtype 0. In Python 2, the byte string is decoded to ``Binary``. -In Python 3, the byte string is decoded back to ``bytes``. - -.. tabs:: - - .. tab:: Python 2.7 - :tabid: python-2 - - .. code-block:: python - - >>> from bson.json_util import loads - >>> loads('{"b": {"$binary": "dGhpcyBpcyBhIGJ5dGUgc3RyaW5n", "$type": "00"}}') - {u'b': Binary('this is a byte string', 0)} - - .. tab:: Python 3.7 - :tabid: python-3 - - .. code-block:: python - - >>> from bson.json_util import loads - >>> loads('{"b": {"$binary": "dGhpcyBpcyBhIGJ5dGUgc3RyaW5n", "$type": "00"}}') - {'b': b'this is a byte string'} - -Can I Share Pickled ObjectIds Between Python 2 and Python 3? ------------------------------------------------------------- - -If you use Python 2 to pickle an instance of ``ObjectId``, -you can always unpickle it with Python 3. To do so, you must pass -the ``encoding='latin-1'`` option to the ``pickle.loads()`` method. -The following code example shows how to pickle an ``ObjectId`` in Python 2.7, and then -unpickle it in Python 3.7: - -.. code-block:: python - :emphasize-lines: 12 - - # Python 2.7 - >>> import pickle - >>> from bson.objectid import ObjectId - >>> oid = ObjectId() - >>> oid - ObjectId('4f919ba2fba5225b84000000') - >>> pickle.dumps(oid) - 'ccopy_reg\n_reconstructor\np0\n(cbson.objectid\...' - - # Python 3.7 - >>> import pickle - >>> pickle.loads(b'ccopy_reg\n_reconstructor\np0\n(cbson.objectid\...', encoding='latin-1') - ObjectId('4f919ba2fba5225b84000000') - -If you pickled an ``ObjectID`` in Python 2, and want to unpickle it in Python 3, -you must pass the ``protocol`` argument with a value of ``2`` or less to the -``pickle.dumps()`` method. 
-The following code example shows how to pickle an ``ObjectId`` in Python 3.7, and then -unpickle it in Python 2.7: - -.. code-block:: python - :emphasize-lines: 7 - - # Python 3.7 - >>> import pickle - >>> from bson.objectid import ObjectId - >>> oid = ObjectId() - >>> oid - ObjectId('4f96f20c430ee6bd06000000') - >>> pickle.dumps(oid, protocol=2) - b'\x80\x02cbson.objectid\nObjectId\nq\x00)\x81q\x01c_codecs\nencode\...' - - # Python 2.7 - >>> import pickle - >>> pickle.loads('\x80\x02cbson.objectid\nObjectId\nq\x00)\x81q\x01c_codecs\nencode\...') - ObjectId('4f96f20c430ee6bd06000000') diff --git a/source/includes/language-compatibility-table-pymongo.rst b/source/includes/language-compatibility-table-pymongo.rst index 6ae014a9..bf848cda 100644 --- a/source/includes/language-compatibility-table-pymongo.rst +++ b/source/includes/language-compatibility-table-pymongo.rst @@ -196,5 +196,137 @@ Python 3 Python 2 ~~~~~~~~ -{+driver-short+} versions 3.7 through 3.12 are compatible with Python 2.7 and PyPy, a Python 2.7- -compatible alternative interpreter. +{+driver-short+} versions 3.7 through 3.12 are compatible with Python 2.7 and PyPy, a +Python 2.7-compatible alternative interpreter. However, in some cases, {+driver-short+} +applications behave differently when running in a Python 2 environment. + +The following sections describe the differences in behavior between Python 2 and Python 3 +when using {+driver-short+}. + +Binary Data +``````````` + +In all versions of Python, {+driver-short+} encodes instances of the +`bytes `__ class +as binary data with subtype 0, the default subtype for binary data. In Python 3, +{+driver-short+} decodes these values to instances of the ``bytes`` class. In Python 2, +the driver decodes them to instances of the +`Binary `__ +class with subtype 0. + +The following code examples show how {+driver-short+} decodes instances of the ``bytes`` +class. Select the :guilabel:`Python 2` or :guilabel:`Python 3` tab to view the corresponding +code. 
+ +.. tabs:: + + .. tab:: Python 2 + :tabid: python2 + + .. io-code-block:: + :copyable: true + + .. input:: + :language: python + + from pymongo import MongoClient + + client = MongoClient() + client.test.test.insert_one({'binary': b'this is a byte string'}) + doc = client.test.test.find_one() + print(doc) + + .. output:: + + {u'_id': ObjectId('67afb78298f604a28f0247b4'), u'binary': Binary('this is a byte string', 0)} + + .. tab:: Python 3 + :tabid: python3 + + .. io-code-block:: + :copyable: true + + .. input:: + :language: python + + from pymongo import MongoClient + + client = MongoClient() + client.test.test.insert_one({'binary': b'this is a byte string'}) + doc = client.test.test.find_one() + print(doc) + + .. output:: + + {'_id': ObjectId('67afb78298f604a28f0247b4'), 'binary': b'this is a byte string'} + +The driver behaves the same way when decoding JSON binary values with subtype 0. In +Python 3, it decodes these values to instances of the ``bytes`` class. In Python 2, +the driver decodes them to instances of the ``Binary`` class with subtype 0. For code +examples that show the differences, see the +:ref:`Extended JSON ` page. + +Pickled ObjectIds +````````````````` + +If you pickled an ``ObjectId`` in Python 2 and want to unpickle it in Python 3, you must +pass ``encoding='latin-1'`` as an argument to the ``pickle.loads()`` method. + +The following example shows how to use Python 3 to unpickle an ``ObjectId`` that was +pickled in Python 2: + +.. code-block:: python + :emphasize-lines: 2 + + import pickle + pickle.loads(b'', encoding='latin-1') + +If a Python 3 application uses a compatible serialization protocol to pickle an ``ObjectId``, +you can use Python 2 to unpickle it. To specify a compatible protocol in Python 3, pass +a value of 0, 1, or 2 for the ``protocol`` parameter of the ``pickle.dumps()`` method. + +The following example pickles an ``ObjectId`` in Python 3, then prints the ``ObjectId`` +and resulting ``bytes`` instance: + +.. 
io-code-block:: + :copyable: true + + .. input:: + :language: python + + import pickle + from bson.objectid import ObjectId + + oid = ObjectId() + oid_bytes = pickle.dumps(oid, protocol=2) + print("ObjectId: {}".format(oid)) + print("ObjectId bytes: {}".format(oid_bytes)) + + .. output:: + :language: shell + + ObjectId: 67af9b1fae9260c0e97eb9eb + ObjectId bytes: b'\x80\x02cbson.objectid\nObjectId\nq\x00... + +The following example unpickles the ``ObjectId`` from the previous example, and then +prints the ``bytes`` and ``ObjectId`` instances: + +.. io-code-block:: + :copyable: true + + .. input:: + :language: python + + import pickle + from bson.objectid import ObjectId + + oid_bytes = b'\x80\x02cbson.objectid\nObjectId\nq\x00...' + oid = pickle.loads(oid_bytes) + print("ObjectId bytes: {}".format(oid_bytes)) + print("ObjectId: {}".format(oid)) + + .. output:: + :language: shell + + ObjectId bytes: b'\x80\x02cbson.objectid\nObjectId\nq\x00)... + ObjectId: 67af9b1fae9260c0e97eb9eb \ No newline at end of file diff --git a/source/read/retrieve.txt b/source/read/retrieve.txt index ea3434b5..f04b6b09 100644 --- a/source/read/retrieve.txt +++ b/source/read/retrieve.txt @@ -81,7 +81,9 @@ the ``"cuisine"`` field has the value ``"Bakery"``: :manual:`natural order ` on disk if no sort criteria is specified. -To learn more about sorting, see the :ref:`sort guide `. + To learn more about sorting, see the :ref:`sort guide `. + + .. _pymongo-retrieve-find-multiple: @@ -204,6 +206,13 @@ to the ``find()`` method: Additional Information ---------------------- +The PyMongoArrow library lets you load MongoDB query result-sets as +`Pandas DataFrames `__, +`NumPy ndarrays `__, or +`Apache Arrow Tables `__. +To learn more about PyMongoArrow, see the +`PyMongoArrow documentation `__. + To learn more about query filters, see :ref:`pymongo-specify-query`. 
For runnable code examples of retrieving documents with {+driver-short+}, see diff --git a/source/serialization.txt b/source/serialization.txt index 99c5c432..8be2fabd 100644 --- a/source/serialization.txt +++ b/source/serialization.txt @@ -90,3 +90,45 @@ it back into a ``Restaurant`` object from the preceding example: To learn more about retrieving documents from a collection, see the :ref:`pymongo-retrieve` guide. + +Binary Fields +------------- + +In all versions of Python, {+driver-short+} encodes instances of the +`bytes `__ class +as binary data with subtype 0, the default subtype for binary data. In Python 3, +{+driver-short+} decodes these values to instances of the ``bytes`` class. In Python 2, +however, the driver decodes them to instances of the +`Binary `__ +class with subtype 0. + +The following code examples use {+driver-short+} to insert a ``bytes`` instance +into MongoDB, and then find the instance. +In Python 2, the byte string is decoded to ``Binary``. +In Python 3, the byte string is decoded back to ``bytes``. + +.. tabs:: + + .. tab:: Python 2.7 + :tabid: python-2 + + .. code-block:: python + + >>> import pymongo + >>> c = pymongo.MongoClient() + >>> c.test.bintest.insert_one().inserted_id + ObjectId('4f9086b1fba5222021000000') + >>> c.test.bintest.find_one() + {u'binary': Binary('this is a byte string', 0), u'_id': ObjectId('4f9086b1fba5222021000000')} + + .. tab:: Python 3.7 + :tabid: python-3 + + .. 
code-block:: python + + >>> import pymongo + >>> c = pymongo.MongoClient() + >>> c.test.bintest.insert_one({'binary': b'this is a byte string'}).inserted_id + ObjectId('4f9086b1fba5222021000000') + >>> c.test.bintest.find_one() + {'binary': b'this is a byte string', '_id': ObjectId('4f9086b1fba5222021000000')} From 044a1a913d87e77dc3e8e5df4ecf081fe56efae1 Mon Sep 17 00:00:00 2001 From: Mike Woofter <108414937+mongoKart@users.noreply.github.com> Date: Fri, 14 Feb 2025 16:15:46 -0600 Subject: [PATCH 4/7] small fixes --- source/compatibility.txt | 2 +- .../language-compatibility-table-pymongo.rst | 49 +------------ source/read/retrieve.txt | 2 - source/serialization.txt | 69 ++++++++++++------- 4 files changed, 46 insertions(+), 76 deletions(-) diff --git a/source/compatibility.txt b/source/compatibility.txt index 72efaab7..d6e6d4f2 100644 --- a/source/compatibility.txt +++ b/source/compatibility.txt @@ -50,5 +50,5 @@ The first column lists the driver version. .. include:: /includes/language-compatibility-table-pymongo.rst -For more information on how to read the compatibility tables, see our guide on +For more information about how to read the compatibility tables, see :ref:`MongoDB Compatibility Tables. ` diff --git a/source/includes/language-compatibility-table-pymongo.rst b/source/includes/language-compatibility-table-pymongo.rst index bf848cda..ae959aef 100644 --- a/source/includes/language-compatibility-table-pymongo.rst +++ b/source/includes/language-compatibility-table-pymongo.rst @@ -212,53 +212,8 @@ as binary data with subtype 0, the default subtype for binary data. In Python 3, {+driver-short+} decodes these values to instances of the ``bytes`` class. In Python 2, the driver decodes them to instances of the `Binary `__ -class with subtype 0. - -The following code examples show how {+driver-short+} decodes instances of the ``bytes`` -class. Select the :guilabel:`Python 2` or :guilabel:`Python 3` tab to view the corresponding -code. - -.. tabs:: - - .. 
tab:: Python 2 - :tabid: python2 - - .. io-code-block:: - :copyable: true - - .. input:: - :language: python - - from pymongo import MongoClient - - client = MongoClient() - client.test.test.insert_one({'binary': b'this is a byte string'}) - doc = client.test.test.find_one() - print(doc) - - .. output:: - - {u'_id': ObjectId('67afb78298f604a28f0247b4'), u'binary': Binary('this is a byte string', 0)} - - .. tab:: Python 3 - :tabid: python3 - - .. io-code-block:: - :copyable: true - - .. input:: - :language: python - - from pymongo import MongoClient - - client = MongoClient() - client.test.test.insert_one({'binary': b'this is a byte string'}) - doc = client.test.test.find_one() - print(doc) - - .. output:: - - {'_id': ObjectId('67afb78298f604a28f0247b4'), 'binary': b'this is a byte string'} +class with subtype 0. For code examples that show the differences, see the +:ref:`Extended JSON ` page. The driver behaves the same way when decoding JSON binary values with subtype 0. In Python 3, it decodes these values to instances of the ``bytes`` class. In Python 2, diff --git a/source/read/retrieve.txt b/source/read/retrieve.txt index f04b6b09..9aa955a1 100644 --- a/source/read/retrieve.txt +++ b/source/read/retrieve.txt @@ -83,8 +83,6 @@ the ``"cuisine"`` field has the value ``"Bakery"``: To learn more about sorting, see the :ref:`sort guide `. - - .. _pymongo-retrieve-find-multiple: Find Multiple Documents diff --git a/source/serialization.txt b/source/serialization.txt index 8be2fabd..c71c0035 100644 --- a/source/serialization.txt +++ b/source/serialization.txt @@ -91,44 +91,61 @@ it back into a ``Restaurant`` object from the preceding example: To learn more about retrieving documents from a collection, see the :ref:`pymongo-retrieve` guide. -Binary Fields -------------- +.. 
_pymongo-serialization-binary-data: + +Binary Data +----------- In all versions of Python, {+driver-short+} encodes instances of the `bytes `__ class as binary data with subtype 0, the default subtype for binary data. In Python 3, {+driver-short+} decodes these values to instances of the ``bytes`` class. In Python 2, -however, the driver decodes them to instances of the +the driver decodes them to instances of the `Binary `__ class with subtype 0. -The following code examples use {+driver-short+} to insert a ``bytes`` instance -into MongoDB, and then find the instance. -In Python 2, the byte string is decoded to ``Binary``. -In Python 3, the byte string is decoded back to ``bytes``. +The following code examples show how {+driver-short+} decodes instances of the ``bytes`` +class. Select the :guilabel:`Python 2` or :guilabel:`Python 3` tab to view the corresponding +code. .. tabs:: - .. tab:: Python 2.7 - :tabid: python-2 + .. tab:: Python 2 + :tabid: python2 - .. code-block:: python + .. io-code-block:: + :copyable: true - >>> import pymongo - >>> c = pymongo.MongoClient() - >>> c.test.bintest.insert_one().inserted_id - ObjectId('4f9086b1fba5222021000000') - >>> c.test.bintest.find_one() - {u'binary': Binary('this is a byte string', 0), u'_id': ObjectId('4f9086b1fba5222021000000')} - - .. tab:: Python 3.7 - :tabid: python-3 + .. input:: + :language: python + + from pymongo import MongoClient + + client = MongoClient() + client.test.test.insert_one({'binary': b'this is a byte string'}) + doc = client.test.test.find_one() + print(doc) + + .. output:: + + {u'_id': ObjectId('67afb78298f604a28f0247b4'), u'binary': Binary('this is a byte string', 0)} + + .. tab:: Python 3 + :tabid: python3 + + .. io-code-block:: + :copyable: true + + .. input:: + :language: python + + from pymongo import MongoClient + + client = MongoClient() + client.test.test.insert_one({'binary': b'this is a byte string'}) + doc = client.test.test.find_one() + print(doc) - .. code-block:: python + .. 
output:: - >>> import pymongo - >>> c = pymongo.MongoClient() - >>> c.test.bintest.insert_one({'binary': b'this is a byte string'}).inserted_id - ObjectId('4f9086b1fba5222021000000') - >>> c.test.bintest.find_one() - {'binary': b'this is a byte string', '_id': ObjectId('4f9086b1fba5222021000000')} + {'_id': ObjectId('67afb78298f604a28f0247b4'), 'binary': b'this is a byte string'} \ No newline at end of file From 0274ca392a98d8ebf7e55e461f3cfce4ae4e1a3e Mon Sep 17 00:00:00 2001 From: Mike Woofter <108414937+mongoKart@users.noreply.github.com> Date: Fri, 14 Feb 2025 16:20:20 -0600 Subject: [PATCH 5/7] fix build error --- source/index.txt | 6 ------ 1 file changed, 6 deletions(-) diff --git a/source/index.txt b/source/index.txt index e9f192d1..86faa57e 100644 --- a/source/index.txt +++ b/source/index.txt @@ -123,12 +123,6 @@ Third-Party Tools For a list of popular third-party Python libraries for working with MongoDB, see the :ref:`pymongo-tools` section. -Frequently Asked questions --------------------------- - -For answers to commonly asked questions about {+driver-short+}, see the -:ref:`pymongo-faq` section. - Troubleshooting --------------- From 81d2c17638c437bb6286f743bd43596e9c2ce7ed Mon Sep 17 00:00:00 2001 From: Mike Woofter <108414937+mongoKart@users.noreply.github.com> Date: Tue, 18 Feb 2025 12:28:50 -0600 Subject: [PATCH 6/7] Apply suggestions from code review Co-authored-by: Jordan Smith <45415425+jordan-smith721@users.noreply.github.com> --- source/connect/mongoclient.txt | 2 +- source/data-formats/extended-json.txt | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/source/connect/mongoclient.txt b/source/connect/mongoclient.txt index 51c2dc51..683afc1d 100644 --- a/source/connect/mongoclient.txt +++ b/source/connect/mongoclient.txt @@ -188,7 +188,7 @@ database, most applications require only a single instance of Multiple Forks ~~~~~~~~~~~~~~~ -{+driver-short+} supports using the ``fork()`` method to create a new process. 
+{+driver-short+} supports calling the ``fork()`` method to create a new process. However, if you fork a process, you must create a new ``MongoClient`` instance in the child process. diff --git a/source/data-formats/extended-json.txt b/source/data-formats/extended-json.txt index 635d5cb5..5a3ad056 100644 --- a/source/data-formats/extended-json.txt +++ b/source/data-formats/extended-json.txt @@ -187,7 +187,7 @@ In Python 3, the driver decodes JSON binary values with subtype 0 to instances o ``bytes`` class. In Python 2, the driver decodes these values to instances of the ``Binary`` class with subtype 0. -The following code examples show how {+driver-short+} decodes JSON binary isntances with +The following code examples show how {+driver-short+} decodes JSON binary instances with subtype 0. Select the :guilabel:`Python 2` or :guilabel:`Python 3` tab to view the corresponding code. From f6a0eb243cca038ed0100ece2695cbbfb789b5d5 Mon Sep 17 00:00:00 2001 From: Mike Woofter <108414937+mongoKart@users.noreply.github.com> Date: Tue, 18 Feb 2025 12:29:39 -0600 Subject: [PATCH 7/7] js feedback --- source/compatibility.txt | 5 +---- source/includes/language-compatibility-table-pymongo.rst | 3 +++ 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/source/compatibility.txt b/source/compatibility.txt index d6e6d4f2..91d64e0c 100644 --- a/source/compatibility.txt +++ b/source/compatibility.txt @@ -48,7 +48,4 @@ The following compatibility table specifies the recommended version of {+driver-short+} for use with a specific version of Python. The first column lists the driver version. -.. include:: /includes/language-compatibility-table-pymongo.rst - -For more information about how to read the compatibility tables, see -:ref:`MongoDB Compatibility Tables. ` +.. 
include:: /includes/language-compatibility-table-pymongo.rst \ No newline at end of file diff --git a/source/includes/language-compatibility-table-pymongo.rst b/source/includes/language-compatibility-table-pymongo.rst index ae959aef..55347b1b 100644 --- a/source/includes/language-compatibility-table-pymongo.rst +++ b/source/includes/language-compatibility-table-pymongo.rst @@ -193,6 +193,9 @@ Python 3 :ref:`TLS ` section of the Troubleshooting guide. .. [#three-six-compat] Pymongo 4.1 requires Python 3.6.2 or later. +For more information about how to read the compatibility tables, see +:ref:`MongoDB Compatibility Tables. ` + Python 2 ~~~~~~~~