Skip to content

fix(NODE-6878): documents.clear() throws a TypeError after cursor is rewound #4488

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Apr 18, 2025

Conversation

italojs
Copy link
Contributor

@italojs italojs commented Mar 26, 2025

Jira issue: https://jira.mongodb.org/browse/NODE-6878

MongoDB Driver: documents?.clear() is not a function error in AbstractCursor.rewind()

Issue Description

When using the MongoDB Node.js driver, there's an issue in the rewind() method of AbstractCursor. The method attempts to call this.documents?.clear() without checking if the clear() method exists on the documents object.

In certain scenarios, particularly when the MongoDB driver uses an optimization called emptyGetMore to avoid unnecessary server calls, this.documents is set to a simplified object literal that lacks the clear() method:

CursorResponse.emptyGetMore = {
    id: new bson_1.Long(0),
    length: 0,
    shift: () => null
    // Note: no clear() method defined
};

This causes an error when rewind() is called on cursors that have this optimization applied:

rewind() {
    if (!this.initialized) {
        return;
    }
    this.cursorId = null;
    this.documents?.clear(); // <-- This line causes the error
    this.isClosed = false;
    this.isKilled = false;
    this.initialized = false;
    // ...
}

Impact

This issue can cause applications to crash when:

  1. Using cursor operations that internally call rewind()
  2. Working with cursors that have been optimized using emptyGetMore
  3. Attempting to reset or reuse cursors after they've been exhausted

Reproduction Details

In this repository, the reproduction.js script demonstrates this issue by:

  1. Creating a MongoDB cursor with a small batch size
  2. Consuming all documents to exhaust the cursor
  3. Simulating the emptyGetMore optimization by replacing the documents object with one that lacks a clear() method
  4. Attempting to rewind the cursor, which triggers the error
  5. Trying to reuse the cursor, which would normally work if rewind() succeeded

Running the Reproduction

# Install dependencies
npm i

# Run the reproduction script
node reproduction.js

Expected output will show the error:

Rewind error: this.documents?.clear is not a function
Error name: TypeError

Proposed Solution

The issue can be fixed by modifying the rewind() method to safely handle cases where documents doesn't have a clear() method. Here's the proposed fix:

rewind() {
    if (!this.initialized) {
        return;
    }
    this.cursorId = null;
    
    // Check if documents exists and has a clear method before calling it
    if (this.documents && typeof this.documents.clear === 'function') {
        this.documents.clear();
    } else {
        // If documents is the emptyGetMore object or doesn't have the clear() method
        // Simply set it to null to force reinitialization
        this.documents = null;
    }
    
    this.isClosed = false;
    this.isKilled = false;
    this.initialized = false;
    // ... rest of the method
}

Why This Solution Works

  1. Safety: The solution checks if clear() exists before calling it
  2. Functionality: Setting documents to null achieves the same goal as clear() - it resets the cursor's document buffer
  3. Compatibility: The solution maintains compatibility with both regular cursors and optimized emptyGetMore cursors
  4. Performance: The solution doesn't introduce any significant overhead

References

FindCursor.rewind() throws documents?.clear() is not a function errors in certain scenarios

In certain scenarios where limit and batchSize are both set on a FindCursor, an internal driver optimization intended to prevent unnecessary requests to the server when the driver knows the cursor is exhausted would prevent the cursor from being rewound. This issue has been resolved.

@italojs italojs requested a review from a team as a code owner March 26, 2025 00:45
@baileympearson baileympearson added the External Submission PR submitted from outside the team label Mar 26, 2025
@dariakp dariakp added the tracked-in-jira Ticket filed in MongoDB's Jira system label Mar 26, 2025
@dariakp dariakp changed the title fix(cursor): ensure documents are cleared correctly to handle emptyGe… fix(NODE-6878): ensure documents are cleared correctly to handle emptyGe… Mar 26, 2025
@baileympearson baileympearson self-assigned this Mar 26, 2025
@baileympearson baileympearson added the Primary Review In Review with primary reviewer, not yet ready for team's eyes label Mar 26, 2025
@baileympearson
Copy link
Contributor

Hey @italojs ,

Thanks for the submission and good catch! We can reproduce this issue as well. The culprit behind this issue seems to be a Typescript cast as unknown as CursorResponse:

} as unknown as CursorResponse;
. Your solution seems to make the error go away but we're concerned that other errors might arise in the future and be squashed by the same Typescript cast.

We think that a more future-proof approach might be to directly instantiate a CursorResponse:

static get emptyGetMore(): CursorResponse {
  return new CursorResponse(
    serialize({ ok: 1, cursor: { id: 0, nextBatch: [] } })
  )
}

Would you be willing to make this change? The static getter is important here to avoid serialization at the module-level and only serialize when needed.

Also, we'll need a test for this scenario to confirm that the behavior has been fixed. Would you be willing to add a regression test? I believe find.test.js is a good place for a test, and we can trigger the error by:

  1. testing on sharded clusters
  2. insert a few documents into a collection
  3. create a cursor with a limit and batch size, where limit is a multiple of the batch size and also less than the number of documents inserted
  4. iterate the cursor limit + 1 times
  5. rewind the cursor

Without the changes in this PR, the above throws an error. I'm also happy to make these changes if you'd prefer, just let me know.

Thanks again!

@italojs italojs force-pushed the fix/rewind branch 4 times, most recently from 1e8494e to dfcf109 Compare April 1, 2025 20:21
@W-A-James W-A-James assigned W-A-James and unassigned baileympearson Apr 2, 2025
@W-A-James W-A-James requested review from W-A-James and removed request for a team April 2, 2025 18:30
Copy link
Contributor

@W-A-James W-A-James left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@italojs Seems like you're failing these tests

@italojs
Copy link
Contributor Author

italojs commented Apr 4, 2025

@W-A-James, How can I see the CI output tests? It's passing on my machine, but I would like to know if it's an environmental issue or a flaky error.


      #count()
        ✔ returns the number of documents
      #forEach()
        15) iterates all the documents
      #toArray()
        16) returns an array with all documents
      #next()
        17) is callable without blocking
      #stream()
        18) creates a node stream that emits data events
      #explain()
        ✔ returns an explain document
  
   

In the GitHub CI, it says I must have to a mongodb.com account. I'm a little bit lot on it to be honest
screenshot in Portuguese

Screenshot 2025-04-03 at 23 49 58

@W-A-James
Copy link
Contributor

@italojs Apologies for the delay, here are the relevant logs: https://gist.github.com./W-A-James/080802d7f36ef774f06641a1b13ffc75

It doesn't seem like we'd be able to grant you access to our CI platform

…rove cursor rewind handling

- Refactored emptyGetMore to return a serialized CursorResponse instance.
- Enhanced the id getter to handle cases where the cursor id may not be present, defaulting to 0.
- Added integration tests to verify cursor rewind functionality after applying the emptyGetMore optimization, ensuring correct behavior when the cursor is exhausted.
@italojs
Copy link
Contributor Author

italojs commented Apr 11, 2025

@W-A-James @baileympearson Ok, can you trigger the CI again, please?
I ran a few local tests for ./test/tools/cluster_setup.sh sharded_cluster', ./test/tools/cluster_setup.sh replica_set, and ./test/tools/cluster_setup.sh server, but a few tests were skipped; I'm not sure why.
hope all tests pass in the CI.

@italojs
Copy link
Contributor Author

italojs commented Apr 14, 2025

The lint CI is not passing, but locally I have this output: no errors, just warnings.

 npm run check:lint

> [email protected] check:lint
> npm run build:dts && npm run check:dts && npm run check:eslint && npm run check:tsd


> [email protected] build:dts
> npm run build:ts && api-extractor run && node etc/clean_definition_files.cjs && ESLINT_USE_FLAT_CONFIG=false eslint --no-ignore --fix mongodb.d.ts lib/beta.d.ts


> [email protected] build:ts
> node ./node_modules/typescript/bin/tsc


api-extractor 7.50.0  - https://api-extractor.com/

Using configuration from ./api-extractor.json
Analysis will use the bundled TypeScript version 5.7.2

API Extractor completed successfully
(node:85842) ESLintRCWarning: You are using an eslintrc configuration file, which is deprecated and support will be removed in v10.0.0. Please migrate to an eslint.config.js file. See https://eslint.org/docs/latest/use/configure/migration-guide for details.
(Use `node --trace-warnings ...` to show where the warning was created)

> [email protected] check:dts
> node ./node_modules/typescript/bin/tsc --noEmit mongodb.d.ts && tsd


> [email protected] check:eslint
> npm run build:dts && ESLINT_USE_FLAT_CONFIG=false eslint -v && ESLINT_USE_FLAT_CONFIG=false eslint --max-warnings=0 --ext '.js,.ts' src test


> [email protected] build:dts
> npm run build:ts && api-extractor run && node etc/clean_definition_files.cjs && ESLINT_USE_FLAT_CONFIG=false eslint --no-ignore --fix mongodb.d.ts lib/beta.d.ts


> [email protected] build:ts
> node ./node_modules/typescript/bin/tsc


api-extractor 7.50.0  - https://api-extractor.com/

Using configuration from ./api-extractor.json
Analysis will use the bundled TypeScript version 5.7.2

API Extractor completed successfully
(node:86136) ESLintRCWarning: You are using an eslintrc configuration file, which is deprecated and support will be removed in v10.0.0. Please migrate to an eslint.config.js file. See https://eslint.org/docs/latest/use/configure/migration-guide for details.
(Use `node --trace-warnings ...` to show where the warning was created)
v9.9.0
(node:86162) ESLintRCWarning: You are using an eslintrc configuration file, which is deprecated and support will be removed in v10.0.0. Please migrate to an eslint.config.js file. See https://eslint.org/docs/latest/use/configure/migration-guide for details.
(Use `node --trace-warnings ...` to show where the warning was created)
(node:86163) ESLintRCWarning: You are using an eslintrc configuration file, which is deprecated and support will be removed in v10.0.0. Please migrate to an eslint.config.js file. See https://eslint.org/docs/latest/use/configure/migration-guide for details.
(Use `node --trace-warnings ...` to show where the warning was created)

> [email protected] check:tsd
> tsd --version && tsd

0.31.2

why is it not passing?

italojs added 3 commits April 14, 2025 10:36
…trieved

- Modified the id getter to require the cursor id to be present, removing the defaulting to 0.
- This change enhances the reliability of cursor handling by ensuring that a valid cursor id is always returned, improving overall response accuracy.
- Updated the id getter to throw an error if the cursor id is missing, ensuring more robust error handling.
- Adjusted the retrieval of cursor id to return 0 when it is null, improving the reliability of cursor operations.
- This change aims to enhance the accuracy of responses by enforcing the presence of a valid cursor id.
- Updated the id getter to throw an error if the cursor id is missing, ensuring more robust error handling.
- Adjusted the retrieval of cursor id to return 0 when it is null, improving the reliability of cursor operations.
- This change aims to enhance the accuracy of responses by enforcing the presence of a valid cursor id.
@baileympearson
Copy link
Contributor

Hey @italojs, sorry for the delay here. A few notes:

  • Just because you asked about lint passing - the lint task in CI also runs our unit tests, which were failing with the changes in your PR (npm run check:unit runs them locally).
  • Cursors always have ids, and I think we want to leave the existing logic that throws exceptions if id isn't present untouched.

The failures in your original implementation were caused by bson's serializer's behavior when serializing numbers. 0 serializes as an int32 (it is less than INT32_MAX and is integral), so when we attempt to parse id as an int64, an exception is thrown. Changing the id to 0n instead or new Long(0) fixes this behavior by forcing the serializer to serialize id as an int64. I confirmed this by reverting the changes to the id getter and running a CI job.

So, to summarize:

  • Can you revert any changes to the get id() getter in the response class?
  • Change the empty cursor response to { ok: 1, cursor: { id: 0n, nextBatch: [] } }?

I'm happy to make these changes, if that's easier, just let me know.

@italojs
Copy link
Contributor Author

italojs commented Apr 14, 2025

@baileympearson
I'm spending a lot of time on it; running the test locally is somewhat painful.
Meteor developers expect this fix, so if you can do it faster, please go ahead.
If possible, let me know when you push the changes.

@dariakp dariakp removed the tracked-in-jira Ticket filed in MongoDB's Jira system label Apr 16, 2025
@baileympearson baileympearson added Team Review Needs review from team and removed Primary Review In Review with primary reviewer, not yet ready for team's eyes labels Apr 16, 2025
@nbbeeken nbbeeken assigned nbbeeken and unassigned W-A-James Apr 18, 2025
@nbbeeken nbbeeken requested review from nbbeeken and removed request for W-A-James April 18, 2025 15:32
@nbbeeken nbbeeken changed the title fix(NODE-6878): ensure documents are cleared correctly to handle emptyGe… fix(NODE-6878): documents.clear() throws a TypeError after cursor is rewound Apr 18, 2025
Copy link
Contributor

@nbbeeken nbbeeken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you @italojs and @baileympearson for collaborating on this!

@nbbeeken nbbeeken removed the request for review from W-A-James April 18, 2025 16:38
@nbbeeken nbbeeken merged commit a1fffeb into mongodb:main Apr 18, 2025
26 of 28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
External Submission PR submitted from outside the team Team Review Needs review from team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants