What is the best way to iterate or recurse through huge amounts of huge functions without exceeding the stack limit?


I have an application that I'm writing in Node.js which needs to make a lot of configuration and database calls in order to process user data. The issue I'm having is that after 11,800+ function calls Node will throw an error and exit the process.


The error says: RangeError: Maximum call stack size exceeded


I'm curious whether anyone else has run into this situation and how they handled it. I've already started to break my code up into a couple of extra worker files, but even so, each time I process a data node it needs to touch 2 databases (at most 25 calls to update various tables) and run a number of sanitization checks.

I'm entirely willing to admit that I may be doing something non-optimal, and I would appreciate some guidance if there is a better approach.

Here is an example of the code I'm running on data:


app.post('/initspeaker', function(req, res) {
    // if the Admin ID is not present ignore
    if(req.body.xyzid!=config.adminid) {
        res.send( {} );
        return;
    }

    var gcnt = 0, dbsize = 0, goutput = [], goutputdata = [], xyzuserdataCallers = [];

    xyz.loadbatchfile( xyz.getbatchurl("speakers", "csv"), function(data) {
        var parsed = csv.parse(data);
        console.log("lexicon", parsed[0]);

        for(var i=1;i<parsed.length;i++) {
            if(typeof parsed[i][0] != 'undefined' && parsed[i][0]!='name') {
                var xyzevent = require('./lib/model/xyz_speaker').create(parsed[i], parsed[0]);
                xyzevent.isPresenter = true;
                goutput.push(xyzevent);
            }
        }
        dbsize = goutput.length;

        xyzuserdataCallers = [new xyzuserdata(),
                                    new xyzuserdata(),
                                    new xyzuserdata(),
                                    new xyzuserdata(),
                                    new xyzuserdata(),
                                    new xyzuserdata(),
                                    new xyzuserdata(),
                                    new xyzuserdata()
                                ];
        // insert all Scheduled Items into the DB                   
        xyzuserdataCallers[0].sendSpeakerData(goutput[0]);
        for(var i=1;i<xyzuserdataCallers.length;i++) {
            xyzuserdataCallers[i].sendSpeakerData(8008);
        }

        //sendSpeakerData(goutput[0]);
    });

    var callback = function(data, func) {
        //console.log(data);
        if(data && data!=8008) {
            if(gcnt>=dbsize) {
                res.send("done");
            } else {
                gcnt++;
                func.sendSpeakerData(goutput[gcnt]);
            }
        } else {
            gcnt++;
            func.sendSpeakerData(goutput[gcnt]);
        }
    };

    // callback loop for fetching registrants for events from SMW
    var xyzuserdata = function() {};
    xyzuserdata.prototype.sendSpeakerData = function(data) {
        var thisfunc = this;

        if(data && data!=8008) {
            //console.log('creating user from data', gcnt, dbsize);
            var userdata = require('./lib/model/user').create(data.toObject());
            var speakerdata = userdata.toObject();
            speakerdata.uid = uuid.v1();
            speakerdata.isPresenter = true;

            couchdb.insert(speakerdata, config.couch.db.user, function($data) {
                if($data==false) {
                    // if this fails it is probably due to a UID colliding
                    console.log("*** trying user data again ***");
                    speakerdata.uid = uuid.v1();
                    arguments.callee( speakerdata );
                } else {
                    callback($data, thisfunc);
                }
            });
        } else {
            gcnt++;
            arguments.callee(goutput[gcnt]);
        }
    };

});

A couple of classes and items are defined here that need some introduction:


  • I am using Express.js + hosted CouchDB and this is responding to a POST request

  • There is a CSV parser class that loads a list of events which drives pulling speaker data

  • Each event can have n number of users (currently around 8K users for all events)

  • I'm using a pattern that loads all of the data/users before attempting to parse any of them

  • Each user loaded (external data source) is converted into an object I can use and also sanitized (strip slashes and such)

  • Each user is then inserted into CouchDB

This code works in the app, but after a while I get an error saying that more than 11,800 calls have been made, and the app breaks. The error doesn't come with a stack trace like one would see for a code error; the process exits because of the number of calls being made.

Again, any assistance/commentary/direction would be appreciated.


2 Answers

#1


5  

It looks like xyzuserdata.sendSpeakerData & callback are being used recursively in order to keep the DB calls sequential. At some point you run out of call stack...
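
To illustrate how this kind of callback recursion can exhaust the stack, here is a minimal sketch. It assumes, purely for demonstration, that the callback is invoked synchronously; `fakeInsert`, `processAll` and `tryProcess` are hypothetical names and not part of the question's code.

```javascript
// Hypothetical stand-in for a database insert that (for this demo)
// invokes its callback synchronously instead of on a later tick.
function fakeInsert(item, done) {
  done(null, item);
}

// Processes items one at a time by recursing from inside the callback.
// Every item adds frames to the call stack that are not released
// until the whole chain has finished.
function processAll(items, index, finished) {
  if (index >= items.length) {
    return finished(null, index);
  }
  fakeInsert(items[index], function (err) {
    if (err) return finished(err);
    processAll(items, index + 1, finished);
  });
}

// Runs processAll over `count` dummy items and reports what happened.
function tryProcess(count) {
  var items = new Array(count).fill(0);
  try {
    var processed;
    processAll(items, 0, function (err, n) { processed = n; });
    return 'ok: ' + processed;
  } catch (e) {
    return e.name + ': ' + e.message;
  }
}

console.log(tryProcess(100));      // small input completes normally
console.log(tryProcess(1000000));  // deep input fails with a RangeError
```

The exact depth at which the error appears depends on the Node version, the stack size setting and how large each frame is, so the 11,800 figure from the question is not a fixed limit.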


There are several modules that make serial execution easier, like Step or Flow-JS.

Flow-JS even has a convenience function to apply a function serially over the elements of the array:


flow.serialForEach(goutput, xyzuserdata.sendSpeakerData, ...)

I wrote a small test program using flow.serialForEach, but unfortunately it also produced a Maximum call stack size exceeded error. It looks like Flow-JS uses the call stack in a similar way to keep things in sync.

Another approach that doesn't build up the call stack is to avoid recursion and use setTimeout with a timeout value of 0 to schedule the callback call. See http://metaduck.com/post/2675027550/asynchronous-iteration-patterns-in-node-js


You could try replacing the callback call with:

setTimeout(callback, 0, $data, thisfunc)
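
As a rough sketch of the setTimeout approach in isolation, the following processes an array one item at a time and schedules each next step on a timer, so the current call stack unwinds before the next item starts. `fakeInsert` and `insertAll` are hypothetical names; `fakeInsert` only stands in for a call like couchdb.insert and is deliberately synchronous, which is the worst case for stack growth.

```javascript
// Hypothetical stand-in for couchdb.insert: calls back synchronously.
function fakeInsert(item, done) {
  done(true);
}

// Inserts items one at a time. The next step is scheduled with
// setTimeout(..., 0) instead of being called directly, so no frames
// pile up between items.
function insertAll(items, finished) {
  var index = 0;

  function next() {
    if (index >= items.length) {
      return finished(index);
    }
    fakeInsert(items[index], function () {
      index++;
      setTimeout(next, 0);
    });
  }

  next();
}

var items = new Array(500).fill('speaker');
insertAll(items, function (count) {
  console.log('inserted', count, 'items');
});
```

One trade-off to be aware of: Node treats a delay of 0 as 1 ms, so a timer per item adds noticeable latency over thousands of items. In current Node versions, setImmediate defers in the same way without that minimum delay.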

#2


1  

Recursion is very useful for synchronizing async operations -- that's why it is used in flow.js etc.


However, if you want to process an unlimited number of elements in an array or a buffered stream, you will need to use Node.js's EventEmitter.

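In pseudo-ish code (the array and the per-item work are stand-ins):

 var EventEmitter = require('events').EventEmitter;

 var ee = new EventEmitter();
 var arr = [1, 2, 3];  // stand-in for a very long array to process
 // called once, either with an error or when everything is done
 var callback = function(err, result) { console.log(err || result); };

 // stand-in for the real async work on one item (e.g. a database insert)
 function doWork(item, done) {
    setTimeout(function() { done(null, item); }, 0);
 }

 // the worker function does everything
 function processOne() {
    if (arr.length === 0) {
       ee.emit('finished');
       return;
    }
    var next = arr.shift();

    doWork(next, function(err, response) {
       if (err)
          callback(err, response);
       else
          // emit() runs listeners synchronously; the stack stays flat
          // only because doWork calls back asynchronously
          ee.emit('done-one');
    });
 }

 // here we process the final event that the worker will emit when done
 ee.on('finished', function() { callback(null, 'we processed the entire array!'); });

 // here we say what to do after one thing has been processed
 ee.on('done-one', function() { processOne(); });

 // here we get the ball rolling
 processOne();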
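
One caveat with the pattern above: EventEmitter's emit() calls its listeners synchronously, so the events alone do not release the stack. If the per-item work can finish synchronously, the emit has to be deferred. Below is a minimal sketch under that assumption, using setImmediate; `handle` and `processArray` are hypothetical names introduced only for this example.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical synchronous work; a stand-in for whatever handles one element.
function handle(item) {
  return item * 2;
}

// Walks the array one element per event, summing the results.
function processArray(arr, callback) {
  var ee = new EventEmitter();
  var index = 0;
  var total = 0;

  function processOne() {
    if (index >= arr.length) {
      ee.emit('finished');
      return;
    }
    total += handle(arr[index++]);
    // A direct emit() would run the 'done-one' listener on top of this
    // frame. setImmediate defers it until the current stack has unwound.
    setImmediate(function () { ee.emit('done-one'); });
  }

  ee.on('done-one', processOne);
  ee.on('finished', function () { callback(null, total); });

  processOne();
}

var big = [];
for (var i = 0; i < 100000; i++) big.push(1);

processArray(big, function (err, total) {
  console.log('processed the entire array, total =', total);
});
```

When the work is already asynchronous, as a real database call normally is, the extra deferral is not needed.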
