Smartfox Server Crashed after this log

Post here your questions about SFS2X. Here we discuss all server-side matters. For client API questions see the dedicated forums.

Moderators: Lapo, Bax

mdprasadeng
Posts: 9
Joined: 28 Apr 2020, 14:40

Smartfox Server Crashed after this log

Postby mdprasadeng » 07 Apr 2021, 15:20

Hello,

Our deployment of SFS crashed today. We see the below in the logs

Code: Select all

07 Apr 2021 | 11:46:18,626 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 8223, Type: DEFAULT, Logged: No, IP: 180.253.165.211:24245 } on Server port: 9933 <---> 24245
07 Apr 2021 | 11:46:19,212 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 8224, Type: DEFAULT, Logged: No, IP: 154.177.28.51:41820 } on Server port: 9933 <---> 41820
07 Apr 2021 | 11:46:19,265 | INFO  | SFSWorker:Sys:3 | util.executor.SmartThreadPoolExecutor |     | Added 16 new threads, current size is: 4
07 Apr 2021 | 11:46:19,265 | INFO  | SFSWorker:Sys:2 | util.executor.SmartThreadPoolExecutor |     | Added 16 new threads, current size is: 4
07 Apr 2021 | 11:46:19,266 | INFO  | SFSWorker:Sys:1 | util.executor.SmartThreadPoolExecutor |     | Added 16 new threads, current size is: 20
07 Apr 2021 | 11:46:19,266 | INFO  | SFSWorker:Sys:4 | util.executor.SmartThreadPoolExecutor |     | Added 16 new threads, current size is: 20

Just after that we have the following logged thousands of times

Code: Select all

07 Apr 2021 | 11:46:19,266 | WARN  | SFSWorker:Ext:24 | controllers.v290.ExtensionReqController |     | com.smartfoxserver.v2.exceptions.SFSExtensionException: User cannot invoke Room extension if he's not joined in that Room. ( User Name: xxxx, Id: 7787, Priv: 0, Sess: 157.47.212.59:40100 ) , [ Room: xxxxx, Id: 18, Group: games, isGame: true ]
   com.smartfoxserver.v2.controllers.v290.ExtensionReqController.processRequest(ExtensionReqController.java:145)
   com.smartfoxserver.v2.controllers.v290.ExtensionReqController$1.run(ExtensionReqController.java:68)
   java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
   java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
   java.lang.Thread.run(Unknown Source)

Followed by

Code: Select all

07 Apr 2021 | 12:00:00,249 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10646, Type: DEFAULT, Logged: No, IP: 180.253.163.253:25009 }
07 Apr 2021 | 12:00:00,250 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10647, Type: DEFAULT, Logged: No, IP: 49.34.132.159:41942 }
07 Apr 2021 | 12:00:00,250 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10648, Type: DEFAULT, Logged: No, IP: 115.178.196.62:24797 }
07 Apr 2021 | 12:00:00,250 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10649, Type: DEFAULT, Logged: No, IP: 149.109.123.30:53828 }
07 Apr 2021 | 12:00:00,251 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10650, Type: DEFAULT, Logged: No, IP: 95.111.243.69:60614 }
07 Apr 2021 | 12:00:00,251 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10651, Type: DEFAULT, Logged: No, IP: 2.176.75.225:39598 }
07 Apr 2021 | 12:00:00,251 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10652, Type: DEFAULT, Logged: No, IP: 115.178.210.98:7702 }
07 Apr 2021 | 12:00:00,251 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10653, Type: DEFAULT, Logged: No, IP: 137.97.92.84:49724 }
07 Apr 2021 | 12:00:00,252 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10654, Type: DEFAULT, Logged: No, IP: 180.251.231.19:45844 }
07 Apr 2021 | 12:00:00,252 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10655, Type: DEFAULT, Logged: No, IP: 157.51.24.114:51400 }
07 Apr 2021 | 12:00:00,252 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10656, Type: DEFAULT, Logged: No, IP: 171.49.211.192:45752 }
07 Apr 2021 | 12:00:00,252 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10657, Type: DEFAULT, Logged: No, IP: 157.34.212.17:60330 }
07 Apr 2021 | 12:00:00,253 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10658, Type: DEFAULT, Logged: No, IP: 171.51.172.184:39443 }
07 Apr 2021 | 12:00:00,253 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10659, Type: DEFAULT, Logged: No, IP: 182.3.72.213:42174 }
07 Apr 2021 | 12:00:00,253 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10660, Type: DEFAULT, Logged: No, IP: 78.190.110.9:10624 }
07 Apr 2021 | 12:00:00,253 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10661, Type: DEFAULT, Logged: No, IP: 46.106.82.234:20670 }
07 Apr 2021 | 12:00:00,253 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10662, Type: DEFAULT, Logged: No, IP: 122.164.46.187:45286 }
07 Apr 2021 | 12:00:00,254 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10663, Type: DEFAULT, Logged: No, IP: 114.31.188.51:44913 }
07 Apr 2021 | 12:00:00,254 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10664, Type: DEFAULT, Logged: No, IP: 49.15.236.149:56997 }
07 Apr 2021 | 12:00:00,254 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10665, Type: DEFAULT, Logged: No, IP: 116.68.69.79:23563 }
07 Apr 2021 | 12:00:00,254 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10666, Type: DEFAULT, Logged: No, IP: 103.26.85.219:31776 }
07 Apr 2021 | 12:00:00,254 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10667, Type: DEFAULT, Logged: No, IP: 157.40.182.62:41466 }
07 Apr 2021 | 12:00:00,255 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10668, Type: DEFAULT, Logged: No, IP: 49.36.169.220:50090 }
07 Apr 2021 | 12:00:00,255 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10669, Type: DEFAULT, Logged: No, IP: 88.155.24.129:10468 }
07 Apr 2021 | 12:00:00,255 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10670, Type: DEFAULT, Logged: No, IP: 37.4.227.142:9679 }
07 Apr 2021 | 12:00:00,255 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10671, Type: DEFAULT, Logged: No, IP: 106.195.117.232:51556 }
07 Apr 2021 | 12:00:00,256 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10672, Type: DEFAULT, Logged: No, IP: 117.230.3.53:5208 }
07 Apr 2021 | 12:00:00,256 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10673, Type: DEFAULT, Logged: No, IP: 116.226.35.218:52847 }
07 Apr 2021 | 12:00:00,256 | INFO  | Scheduler1-thread-1 | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 10674, Type: DEFAULT, Logged: No, IP: 1.38.52.32:65198 }
07 Apr 2021 | 12:00:00,372 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 7863, Type: DEFAULT, Logged: Yes, IP: 114.122.139.178:39270 }
07 Apr 2021 | 12:00:00,373 | INFO  | SFSWorker:Sys:2 | v2.api.SFSApi |     | User disconnected: { Zone: DevilAmongstUsV2 }, ( User Name: dau-79487c50-2ae6-d69a-4c59-99eb58844118, Id: 7592, Priv: 0, Sess: 114.122.139.178:39270 ) , SessionLen: 1413708, Type: Unity / .Net
07 Apr 2021 | 12:00:00,924 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10696, Type: DEFAULT, Logged: No, IP: 78.190.110.9:9732 } on Server port: 9933 <---> 9732
07 Apr 2021 | 12:00:01,510 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session removed: { Id: 8056, Type: DEFAULT, Logged: Yes, IP: 182.1.203.103:50016 }
07 Apr 2021 | 12:00:06,297 | INFO  | SocketReader | bitswarm.core.SocketReader |     | Socket closed: java.nio.channels.SocketChannel[closed]
07 Apr 2021 | 12:00:06,297 | INFO  | SFSWorker:Sys:2 | v2.api.SFSApi |     | User disconnected: { Zone: DevilAmongstUsV2 }, ( User Name: dau-3a050222-7bbe-1f5a-e55d-274b0e67c627, Id: 7778, Priv: 0, Sess: 182.1.203.103:50016 ) , SessionLen: 1079069, Type: Unity / .Net
07 Apr 2021 | 12:00:06,303 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10697, Type: DEFAULT, Logged: No, IP: 114.31.188.51:44875 } on Server port: 9933 <---> 44875
07 Apr 2021 | 12:00:06,303 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10698, Type: DEFAULT, Logged: No, IP: 171.51.172.184:39455 } on Server port: 9933 <---> 39455
07 Apr 2021 | 12:00:06,303 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10699, Type: DEFAULT, Logged: No, IP: 149.109.123.30:53874 } on Server port: 9933 <---> 53874
07 Apr 2021 | 12:00:06,303 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10700, Type: DEFAULT, Logged: No, IP: 157.51.24.114:51401 } on Server port: 9933 <---> 51401
07 Apr 2021 | 12:00:06,304 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10701, Type: DEFAULT, Logged: No, IP: 171.49.211.192:45754 } on Server port: 9933 <---> 45754
07 Apr 2021 | 12:00:06,304 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10702, Type: DEFAULT, Logged: No, IP: 157.34.212.17:60334 } on Server port: 9933 <---> 60334
07 Apr 2021 | 12:00:06,304 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10703, Type: DEFAULT, Logged: No, IP: 37.4.227.142:2366 } on Server port: 9933 <---> 2366
07 Apr 2021 | 12:00:06,305 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10704, Type: DEFAULT, Logged: No, IP: 117.230.3.53:4912 } on Server port: 9933 <---> 4912
07 Apr 2021 | 12:00:06,305 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10705, Type: DEFAULT, Logged: No, IP: 103.26.85.219:31785 } on Server port: 9933 <---> 31785
07 Apr 2021 | 12:00:06,305 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10706, Type: DEFAULT, Logged: No, IP: 49.36.169.220:50091 } on Server port: 9933 <---> 50091
07 Apr 2021 | 12:00:06,305 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10707, Type: DEFAULT, Logged: No, IP: 95.13.159.136:57041 } on Server port: 9933 <---> 57041
07 Apr 2021 | 12:00:06,306 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10708, Type: DEFAULT, Logged: No, IP: 157.47.87.32:50866 } on Server port: 9933 <---> 50866
07 Apr 2021 | 12:00:06,306 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10709, Type: DEFAULT, Logged: No, IP: 180.252.113.150:58382 } on Server port: 9933 <---> 58382
07 Apr 2021 | 12:00:06,306 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10710, Type: DEFAULT, Logged: No, IP: 180.253.163.253:43465 } on Server port: 9933 <---> 43465
07 Apr 2021 | 12:00:06,306 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10711, Type: DEFAULT, Logged: No, IP: 109.201.106.142:52168 } on Server port: 9933 <---> 52168
07 Apr 2021 | 12:00:06,307 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10712, Type: DEFAULT, Logged: No, IP: 157.32.112.141:33920 } on Server port: 9933 <---> 33920
07 Apr 2021 | 12:00:06,307 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10713, Type: DEFAULT, Logged: No, IP: 60.254.61.81:56330 } on Server port: 9933 <---> 56330
07 Apr 2021 | 12:00:06,307 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10714, Type: DEFAULT, Logged: No, IP: 36.85.216.27:19327 } on Server port: 9933 <---> 19327
07 Apr 2021 | 12:00:06,488 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10715, Type: DEFAULT, Logged: No, IP: 106.195.117.232:53708 } on Server port: 9933 <---> 53708
07 Apr 2021 | 12:00:06,559 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10716, Type: DEFAULT, Logged: No, IP: 116.68.69.79:13371 } on Server port: 9933 <---> 13371
07 Apr 2021 | 12:00:06,922 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10717, Type: DEFAULT, Logged: No, IP: 182.3.72.213:42204 } on Server port: 9933 <---> 42204
07 Apr 2021 | 12:00:06,939 | INFO  | SocketReader | bitswarm.sessions.DefaultSessionManager |     | Session created: { Id: 10718, Type: DEFAULT, Logged: No, IP: 46.106.82.234:20254 } on Server port: 9933 <---> 20254


After which we restarted the instance.

From what I saw the thread count went beyond 400 when its generally around 90.

We are using Smartfox 2.13.7.

Has anyone faced anything similar ? Any pointers on how to debug the root cause?

Regards,
Durga Prasad.
User avatar
Lapo
Site Admin
Posts: 23025
Joined: 21 Mar 2005, 09:50
Location: Italy

Re: Smartfox Server Crashed after this log

Postby Lapo » 07 Apr 2021, 16:07

Hi,
do you know the exact cause of the crash? Was it an OutOfMemoryException? You should find more details in the logs.

These messages:
07 Apr 2021 | 11:46:19,265 | INFO | SFSWorker:Sys:3 | util.executor.SmartThreadPoolExecutor | | Added 16 new threads, current size is: 4
07 Apr 2021 | 11:46:19,265 | INFO | SFSWorker:Sys:2 | util.executor.SmartThreadPoolExecutor | | Added 16 new threads, current size is: 4
07 Apr 2021 | 11:46:19,266 | INFO | SFSWorker:Sys:1 | util.executor.SmartThreadPoolExecutor | | Added 16 new threads, current size is: 20
07 Apr 2021 | 11:46:19,266 | INFO | SFSWorker:Sys:4 | util.executor.SmartThreadPoolExecutor | | Added 16 new threads, current size is: 20

are saying that the system is adding more threads for the Extension(s) because all current thread are already busy.
This in turn means that every Extension thread is taking a significant amount of time to complete its task before returning to the thread pool which can be symptom of a bottleneck: such as a slow DB connection, or other I/O that you might be running in your Extension code.

Are you monitoring the server via the AdminTool or the OS, to check whether CPU and memory usage are under control?

Let us know.
Lapo
--
gotoAndPlay()
...addicted to flash games
mdprasadeng
Posts: 9
Joined: 28 Apr 2020, 14:40

Re: Smartfox Server Crashed after this log

Postby mdprasadeng » 09 Apr 2021, 06:52

Hey,
Thanks for the reply. I still don't know if there are any slow IO operations going on. (We only have IO operations on login and that service didn't become slow).

The headpdump on crash reveals the following
A Linked List Data Structure Detected.
2,221,979,600 bytes (92.95 %) of Java heap is used by 794,146 instances of java/util/concurrent/LinkedBlockingQueue$Node
Contains 2 types of the following objects:
- 3,559 instances of com/smartfoxserver/v2/core/SFSEventManager$SFSEventRunner holding 8,323,576 bytes
- 790,586 instances of com/smartfoxserver/v2/controllers/v290/ExtensionReqController$1 holding 2,188,243,320 bytes
Contains an instance) of the leak suspect:
- com/smartfoxserver/v2/SmartFoxServer holding 684,432,144 bytes at 0x3c0007160


It might be a side effect of some slow IO but still SFS should ideally should not keep spawning these many ExtensionReqControllers right? Am I missing a config that can control the amount of RequestLoggers right?
The config we use is in the attached image.
sfs-config.png
(122.31 KiB) Not downloaded yet

Regards,
Durga
User avatar
Lapo
Site Admin
Posts: 23025
Joined: 21 Mar 2005, 09:50
Location: Italy

Re: Smartfox Server Crashed after this log

Postby Lapo » 09 Apr 2021, 09:24

The heap dump is showing that the Extension Queue is full of requests that should be dealt with but the system is not keeping up.
Either your server hardware is unable to keep up with the amount of requests, which can be easily verified by checking the CPU % usage, or there is a bottleneck in your Extension, holding threads for a long time and lagging behind while the requests keep piling up in the main Extension queue.
In this second case you would just see the Extension queue filling up while CPU usage is relatively low.

If you monitor SFS2X at runtime via its AdminTool's Dashboard you can get a much clearer picture of what is going on, as the tool allows you to monitor CPU, heap memory, threads and queue states.

Also you can take a look at this guide:
http://docs2x.smartfoxserver.com/Advanc ... ive-server

Hope it helps
Lapo

--

gotoAndPlay()

...addicted to flash games
mdprasadeng
Posts: 9
Joined: 28 Apr 2020, 14:40

Re: Smartfox Server Crashed after this log

Postby mdprasadeng » 10 Apr 2021, 16:37

Thanks Lapo, for the response.

I am monitoring via the admin tool, the cpu and memory are well within limits but its true that the Extension Queue is full of requests.

There is a direct corelation between these errors

Code: Select all

v290.ExtensionReqController |     | com.smartfoxserver.v2.exceptions.SFSExtensionException: User cannot invoke Room extension if he's not joined in that Room


And a memory leak in v290/ExtensionReqController.

While we are trying to understand what is making the client send such requests, its taking time to find the issue on the client.

Is the memory leak in v290/ExtensionReqController a known issue? Any solutions for this, like upgrading the SFS version (right now at 2.13.7)

Regards,
Durga Prasad
User avatar
Lapo
Site Admin
Posts: 23025
Joined: 21 Mar 2005, 09:50
Location: Italy

Re: Smartfox Server Crashed after this log

Postby Lapo » 12 Apr 2021, 11:08

A memory leak is a situation where unwanted objects remain hanging in memory, where they should instead be disposed of.

This is a different situation: here the server is keeping these objects in memory (and they do accumulate) because they need to be processed. So this is not garbage data.

The problem is not the large amount of events in memory, but rather finding the cause of the overall slowdown, where threads (even after multiple backups are added) are not able to keep up with the requests.
Either there's too much events sent by clients (i.e. too fast packet rate), which seems weird but no impossible, or rather there's an issue with how fast each Extension thread are working, which is what the log files are pointing to.

I hope this clarifies the situation.

Cheers
Lapo

--

gotoAndPlay()

...addicted to flash games

Return to “SFS2X Questions”

Who is online

Users browsing this forum: No registered users and 67 guests