[GH-ISSUE #49] Use more than one CPU #36

Closed
opened 2026-05-23 08:27:27 -06:00 by gitea-mirror · 16 comments
Owner

Originally created by @KesleyDavid on GitHub (Nov 29, 2021).
Original GitHub issue: https://github.com/appy-one/acebase/issues/49

Good morning,
I'm implementing a system with the acebase database, but we're facing a problem: the system makes several general listing queries, and some of these collections have 26 thousand records.

We put it on a server with 10 vCPUs, but we observed that the database only uses 1 vCPU at a time. We are using pm2.
We tried to put it in cluster mode, but without success.

Is there any solution or alternative that lets us take advantage of all the server's vCPUs? With only 1 vCPU, when we make two or three large simultaneous requests, the database freezes for a long time.

![image](https://user-images.githubusercontent.com/39314443/143944010-b612e902-2495-418e-9f27-a722e501bca7.png)

![image](https://user-images.githubusercontent.com/39314443/143944126-77331663-9db6-4488-ad26-54ece38d57cd.png)

@appy-one commented on GitHub (Nov 29, 2021):

Hi Kesley, good evening at this side of the globe!

Using multiple CPUs is definitely possible using Node.js clustering or pm2, but you shouldn't have to if you add indexes. With an index (or multiple), you'll be able to get query results in milliseconds because it won't have to plow through all records to check if they match your criteria. I'm happy to help you with any troubles you run into using multiple threads, but I think you should check out indexing first. Info about indexes can be found [here](https://github.com/appy-one/acebase#indexing-data). Let me know if you need more info, or if you already use indexes.
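
To make the indexing advice concrete, here is a minimal sketch of creating indexes up front and then filtering on an indexed key. It assumes the `db.indexes.create(path, key)` and `db.query(path).filter(key, op, value).get()` calls described in the indexing section of the AceBase README linked above. The `orders` collection and its `status` and `customerId` keys are hypothetical, so substitute your own paths.

```js
// Sketch only: `db` is an AceBase (or AceBaseClient) instance created elsewhere.
// The 'orders' path and the 'status' / 'customerId' keys are hypothetical.

// Create one index per (path, key) pair before the queries that rely on it run.
async function ensureIndexes(db, specs) {
  for (const { path, key } of specs) {
    await db.indexes.create(path, key);
  }
}

// Filter on an indexed key so the database does not have to check every record.
function findOrdersByStatus(db, status) {
  return db.query('orders').filter('status', '==', status).get();
}

// Example usage (commented out because it needs a running database):
// await ensureIndexes(db, [
//   { path: 'orders', key: 'status' },
//   { path: 'orders', key: 'customerId' },
// ]);
// const snapshots = await findOrdersByStatus(db, 'open');
```

Passing `db` in as a parameter keeps the helpers independent of how the database or client is constructed.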

@KesleyDavid commented on GitHub (Nov 30, 2021):

Good morning, thanks for the information.
Creating indices would indeed be the most suitable approach, but we are migrating from firebase to acebase and, because of time constraints, I just "converted" the system's searches to acebase.
To create the indices, I will need to refactor all the queries and change the way the data is grouped in the system.

I'm thinking about creating the cluster initially, so that the system works. Then, over time, I'll refactor the code piece by piece and add indices to gain performance.

I'm getting an error when creating the cluster with pm2. Do you know what it could be? The screenshots follow:

![image](https://user-images.githubusercontent.com/39314443/144049980-85a285c7-59bf-4c68-9e42-1f7c6c4b6c1a.png)

```js
// ecosystem.config.js
module.exports = {
  apps : [{
    name: 'server-database',
    script: 'server-database.js',
    watch: true,
    exec_mode: 'cluster',
    instances: '2',
    ignore_watch: ["node_modules"],
  }],
};
```

```js
// server-database.js
const { AceBaseServer } = require('acebase-server');

const server = new AceBaseServer('dbREX', {
  host: 'localhost',
  port: 5757,
  authentication: {
    enabled: true,
    allowUserSignup: false,
    defaultAccessRule: 'auth',
  }
});

server.on("ready", () => {
    console.log("SERVER ready");
});
```

Code:
`pm2 start ecosystem.config.js`

LOGs:
![image](https://user-images.githubusercontent.com/39314443/144050348-75436f2a-88bb-4435-985a-9528e121bd3e.png)

CLIENT:

```js
const { AceBaseClient } = require('acebase-client');

const ACEBASE = new AceBaseClient({ host: 'localhost', port: 5757, dbname: 'dbREX', https: false });
ACEBASE.ready(() => {
    console.log('Connected successfully');
});
```

![image](https://user-images.githubusercontent.com/39314443/144050606-8ee2375c-07ef-4b2c-ad9a-502bb9f43a99.png)

@appy-one commented on GitHub (Nov 30, 2021):

I did some additional research: you can't use pm2 for clustering at the moment. In a pm2 cluster there is no master process (pm2 itself is the master), so the IPC between the processes doesn't work. The different processes have to be able to "talk" to each other because they read from and write to the same database file, and accessing the data independently would cause corruption. You can use Node.js's native clustering functionality, but then you'll have to fork the child processes yourself.

I strongly encourage you to take a look at indexing first. I'm working on a solution for pm2.
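
Forking the child processes yourself could look roughly like the sketch below, which uses only Node.js's built-in `cluster` and `os` modules. The `startCluster` helper and its parameters are hypothetical names introduced for illustration. Whether AceBase needs any additional settings inside each worker is not covered in this thread, so treat this as a starting point rather than a verified setup.

```js
const os = require('node:os');

// Sketch only: forks `workerCount` child processes from the primary process and
// runs `startWorker` in each child. `cluster` is Node's built-in cluster module,
// passed in as a parameter so the helper is easy to test.
function startCluster(cluster, startWorker, workerCount = os.cpus().length) {
  // `isPrimary` exists on Node 16+; older versions expose `isMaster` instead.
  const isPrimary = cluster.isPrimary ?? cluster.isMaster;
  if (isPrimary) {
    for (let i = 0; i < workerCount; i++) {
      cluster.fork();
    }
    return 'primary';
  }
  startWorker();
  return 'worker';
}

// Example usage, reusing the server settings from this thread (commented out
// because it needs acebase-server installed):
// startCluster(require('node:cluster'), () => {
//   const { AceBaseServer } = require('acebase-server');
//   const server = new AceBaseServer('dbREX', { host: 'localhost', port: 5757 });
//   server.on('ready', () => console.log(`SERVER ready in process ${process.pid}`));
// });
```

Because the primary process created by `cluster.fork()` is a real parent of the workers, the parent/child IPC channel that pm2's cluster mode lacks is available here.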

@KesleyDavid commented on GitHub (Nov 30, 2021):

Thanks for the explanations. I started creating some indexes in the database, and the performance gain was really high. However, on the dashboard screen, for example, the entire server still freezes and other users cannot access data, because the main process is saturating the CPU.
I believe the cluster with pm2 would be the most suitable option in the current situation.

@appy-one commented on GitHub (Dec 1, 2021):

I understand your need for multiple processes. The only way to do that now is by bypassing pm2 and forking the process yourself. I'm working on an IPC implementation that will use an external server for communication between isolated (pm2) processes, but that obviously will take some time.

The very best you can do now is investigate what is causing that extreme load on your server process. Is your dashboard requesting too much data, too frequently? For example, if you are using `value` events on large data trees, please keep in mind that every change anywhere in the tree will require the database to load all data being monitored in order to provide new/previous value pairs for the events to fire. If you have a `db.ref('orders').on('value', callback)`, each tiny mutation to any order (such as `db.ref('orders/order553/shipped').set(true)`) will cause ALL orders to be loaded from the db, paired into previous/new values, and sent over the network. For monitoring data changes in large collections, it's better to use `notify_value`, `mutated` or `mutations` events. Let me know if you think the load has other or unexpected causes; I'm happy to investigate and help.
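
The difference between the two subscription styles can be sketched as follows. The `.on(event, callback)` call and the event names come from the comment above; the helper names are hypothetical, and what exactly the callback receives for each event type is not stated in this thread, so the sketch passes it through untouched.

```js
// Sketch only: `db` is an AceBase (or AceBaseClient) instance created elsewhere.

// Costly on large collections: every change below `path` makes the database
// load the whole monitored tree to build previous/new value pairs.
function watchWholeValue(db, path, callback) {
  return db.ref(path).on('value', callback);
}

// Lighter alternative: 'mutated' is one of the events suggested above for
// monitoring changes in large collections ('notify_value' and 'mutations'
// are the other two).
function watchMutations(db, path, callback) {
  return db.ref(path).on('mutated', callback);
}

// Example usage (commented out because it needs a running database):
// watchMutations(db, 'orders', (change) => console.log('orders changed', change));
```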

@appy-one commented on GitHub (Dec 15, 2021):

I've been working on the clustering functionality, and it is now possible to create pm2 and cloud-based AceBaseServer clusters!
Check out the new [AceBase IPC server](https://github.com/appy-one/acebase-ipc-server) repository; its documentation describes in detail how to set up a cluster of AceBase servers. All of this is obviously brand spanking new and might have issues, so I would really appreciate it if you'd help with testing!

@KesleyDavid commented on GitHub (Dec 15, 2021):

@appy-one
Good morning, very good, and thank you for the initiative.
I'm going to start testing, and I hope to be able to contribute to the project as well.


@KesleyDavid commented on GitHub (Dec 16, 2021):

@appy-one,
I am encountering some issues when testing the cluster.
Even with these errors, I can query the data in the database, but apparently something is wrong with the cluster.
Do you know what it could be?
I created a repository with the entire environment for us to test.

Repository:
https://github.com/KesleyDavid/test-db-acebase-cluster

Error:
![image](https://user-images.githubusercontent.com/39314443/146396990-eb5192fa-8c9c-4aa7-a21f-9d2f1ce32935.png)

@appy-one commented on GitHub (Dec 16, 2021):

Ah, I see. I didn't test the websocket connection; I assumed that would just work. (🙈) I googled, and apparently socket.io is not able to connect to a pm2 clustered server using long-polling. You could disable long-polling on the client so it'll only try connecting through websockets, but then there might be other state-related issues. The websocket between client and server is mainly used for event notifications, and that might become an issue in such a cluster. I'll dive into it!

@KesleyDavid commented on GitHub (Dec 16, 2021):

I get it now. There really are a lot of details involved in getting the database into a working cluster.

I'm also researching and doing some tests.

@appy-one commented on GitHub (Dec 17, 2021):

I updated acebase-client (v1.10.1); it now uses only the websocket transport to connect to the server (long-polling is disabled). This should fix the connection issue and, as far as I can see, will not have any negative side effects.

A websocket connection from client to server stays connected to the same server process, so there should be no issues with state. Data updates and retrieval use regular http requests, so they will be load balanced in your cluster. Event notifications are sent over the websocket connection, and their subscriptions are state-managed within the AceBase cluster through IPC, so this should also work ok.

Let me know if this fixes it!
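
For readers who manage their own socket.io connections against a clustered server, the same idea can be sketched with socket.io-client's `transports` option. This is a general illustration, not the code inside acebase-client, which according to the comment above already applies this from v1.10.1. The `connectWebsocketOnly` helper and the URL are hypothetical.

```js
// Sketch only: `io` is the connect function exported by the socket.io-client
// package, passed in so the helper has no hard dependency on it.
function connectWebsocketOnly(io, url) {
  // Listing only 'websocket' skips the HTTP long-polling transport, so the
  // connection does not depend on several polling requests all reaching the
  // same clustered process.
  return io(url, { transports: ['websocket'] });
}

// Example usage (commented out because it needs socket.io-client and a running
// server). Depending on the socket.io-client version, `io` is either the
// module's default export or a named export:
// const { io } = require('socket.io-client');
// const socket = connectWebsocketOnly(io, 'http://localhost:5757');
```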

@KesleyDavid commented on GitHub (Dec 17, 2021):

Thank you, I will run the tests.

@KesleyDavid commented on GitHub (Dec 17, 2021):

Good evening. The workload is being balanced evenly, request by request, across the database. I'll run more tests over the weekend and post the results here.

![image](https://user-images.githubusercontent.com/39314443/146610925-2e47f455-910c-40bd-81ff-3f1345fa0b30.png)

@KesleyDavid commented on GitHub (Dec 18, 2021):

The cluster apparently works perfectly. However, I have a problem creating indices in the database: I run the command as an admin user of the database, but nothing happens and the index is not created.

I tested on a table with several items and it didn't work. Then I tested on a table with only one item, and even then the code stops when I try to add an index.

@appy-one commented on GitHub (Dec 20, 2021):

Thanks, I'll test the indexing when running in a cluster


@appy-one commented on GitHub (Dec 20, 2021):

I published the indexing issue fix in acebase v1.12.3

Reference: github-starred/acebase#36