SFS2X behavior at bandwidth saturation

tchen
Posts: 191
Joined: 11 Dec 2010, 14:14

Post by tchen » 09 Feb 2011, 16:29

I am doing some load testing and am noticing that the client driver doesn't recover gracefully after experiencing packet loss. Maybe I'm doing something wrong, but my crude test consists of getting SFS's built-in public message system to reflect inbound messages.

At about 60+ CCU on my test rig, the outbound link on the VPS is saturated and starts dropping outbound packets. At this point, some of the .NET clients throw the exception below and the disconnected flag is set (but the corresponding events are not fired).

Code:

[SFS - ERROR] TCPSocketLayer: General error reading data from socket: Invalid SFSDataType. Expected: SFS_OBJECT, found: 66
  at Sfs2X.Protocol.Serialization.DefaultSFSDataSerializer.DecodeSFSObject (Sfs2X.Util.ByteArray buffer) [0x00000]
  at Sfs2X.Protocol.Serialization.DefaultSFSDataSerializer.Binary2Object (Sfs2X.Util.ByteArray data) [0x00000]
  at Sfs2X.Entities.Data.SFSObject.NewFromBinaryData (Sfs2X.Util.ByteArray ba) [0x00000]
  at Sfs2X.Core.SFSProtocolCodec.OnPacketRead (Sfs2X.Util.ByteArray packet) [0x00000]
  at Sfs2X.Core.SFSIOHandler.HandlePacketData (Sfs2X.Util.ByteArray data) [0x00000]
  at Sfs2X.Core.SFSIOHandler.OnDataRead (Sfs2X.Util.ByteArray data) [0x00000]
  at Sfs2X.Bitswarm.BitSwarmClient.OnSocketData (System.Byte[] data) [0x00000]
  at Sfs2X.Core.Sockets.TCPSocketLayer.CallOnData (System.Byte[] data) [0x00000]
  at Sfs2X.Core.Sockets.TCPSocketLayer.HandleBinaryData (System.Byte[] buf, Int32 size) [0x00000]
  at Sfs2X.Core.Sockets.TCPSocketLayer.Read () [0x00000]


The server still keeps the socket connection open for a while, though, and continuously pumps data into the void until the idle timeout kicks in.

The question is: is it the expected behavior for the client to unilaterally drop the connection after the first corrupted packet?

And if so, is there any way to throttle the server to keep this occurrence to a minimum?
tchen
Posts: 191
Joined: 11 Dec 2010, 14:14

Post by tchen » 09 Feb 2011, 16:30

Throwing up the test client code for review

Code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Sfs2X;
using Sfs2X.Core;
using Sfs2X.Entities;
using Sfs2X.Entities.Data;
using Sfs2X.Requests;
using Sfs2X.Logging;

namespace ChatterBox
{
    class Program
    {
        static void Main(string[] args)
        {
            List<Program> pool = new List<Program>();
            for (int i = 0; i < clientCount; i++)
            {
                Program app = new Program();
                pool.Add(app);
            }

            while (true)
            {
                foreach (Program app in pool)
                {
                    if (app.m_bStarted)
                    {
                        try
                        {
                            app.Update();
                        }
                        catch(Exception)
                        {
                            if (!app.m_sfs.IsConnected)
                            {
                                Console.WriteLine("Connection terminated");
                            }
                        }
                    }
                    else
                    {
                        if (startupQueue.Count < simultaneousConnect)
                        {
                            startupQueue.Add(app);
                            app.Connect();
                        }
                    }
                }
            }
        }

        // TEST SETTINGS

        static int simultaneousConnect = 5;
        static int clientCount = 75;
        static List<Program> startupQueue = new List<Program>();
        static Random rnd = new Random();
           
        bool m_bStarted = false;
        SmartFox m_sfs;
        Room m_room;
        DateTime m_nextMsgTime = DateTime.Now;
        string m_host = "xxx.xxx.xxx.xxx";
        bool m_silent = true;
        int m_minDelay = 200;
        int m_maxDelay = 500;       // ms
        int m_payloadSize = 111;    // bytes

        public Program()
        {
            m_sfs = new SmartFox();
            m_sfs.AddEventListener(SFSEvent.CONNECTION, OnConnection);
            m_sfs.AddEventListener(SFSEvent.CONNECTION_LOST, OnConnectionLost);
            m_sfs.AddEventListener(SFSEvent.LOGIN, OnLogin);
            m_sfs.AddEventListener(SFSEvent.LOGIN_ERROR, OnLoginError);
            m_sfs.AddEventListener(SFSEvent.LOGOUT, OnLogout);
            m_sfs.AddEventListener(SFSEvent.ROOM_JOIN, OnRoomJoin);
            m_sfs.AddEventListener(SFSEvent.ROOM_REMOVE, OnRoomRemove);
            m_sfs.AddEventListener(SFSEvent.PUBLIC_MESSAGE, OnPublicMessage);
        }

        public void Connect()
        {
            m_sfs.Connect(m_host, 9933);
            m_bStarted = true;
        }

        // Builds a random string of letters A-Z (or a-z when lowerCase is set).
        private string randomString(int size, bool lowerCase)
        {
            StringBuilder builder = new StringBuilder(size);
            for (int i = 0; i < size; i++)
            {
                // Use the shared generator: on runtimes that seed new Random() from the
                // clock, instances created in quick succession can repeat the same sequence.
                builder.Append((char)('A' + rnd.Next(26)));
            }
            if (lowerCase)
                return builder.ToString().ToLower();
            return builder.ToString();
        }

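        public void Update()
        {
            // Dispatch queued events before the connection check. If the API queues
            // events until ProcessEvents() is called, returning early here would mean
            // an event raised after the socket drops (e.g. CONNECTION_LOST) is never delivered.
            m_sfs.ProcessEvents();

            if (!m_sfs.IsConnected)
            {
                return;
            }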

            // start spamming
            if (m_room != null)
            {
                if (DateTime.Now > m_nextMsgTime)
                {
                    // create a payload
                    string msg = randomString(m_payloadSize, false);
                    m_sfs.Send(new PublicMessageRequest(msg));
                    m_nextMsgTime = DateTime.Now.AddMilliseconds(rnd.Next(m_minDelay,m_maxDelay));
                }
            }
        }

        void OnConnection(BaseEvent evt)
        {
            startupQueue.Remove(this);

            bool success = (bool)evt.Params["success"];

            if (success)
            {
                Console.WriteLine("Connected");
                string user = string.Format("bubba.{0}", rnd.Next());
                string pass = "";
                m_sfs.Send(new LoginRequest(user, pass, "SimpleChat"));
            }
        }

        void OnConnectionLost(BaseEvent evt)
        {
            Console.WriteLine("Connection lost");
        }

        void OnLogin(BaseEvent evt)
        {
            bool success = true;
            if (evt.Params.ContainsKey("success") && !(bool)evt.Params["success"])
            {
                success = false;
            }
            Console.WriteLine("Login:" + success.ToString());

            if (success)
            {
                // join room
                Room chatter = m_sfs.GetRoomByName("The Lobby");
                m_sfs.Send(new JoinRoomRequest(chatter.Id));
            }
            else
            {
                m_sfs.Disconnect();
            }
        }

        void OnLoginError(BaseEvent evt)
        {
            Console.WriteLine("Login error:" + (string)evt.Params["errorMessage"]);
        }

        void OnLogout(BaseEvent evt)
        {
            Console.WriteLine("Logout complete");
        }


        void OnRoomJoin(BaseEvent evt)
        {
            Console.WriteLine("Joined room");
            m_room = (Room)evt.Params["room"];
            m_nextMsgTime = DateTime.Now;
        }

        void OnRoomRemove(BaseEvent evt)
        {
            Console.WriteLine("Left room");
        }

        void OnPublicMessage(BaseEvent evt)
        {
            if (!m_silent)
            {
                User sender = (User)evt.Params["sender"];
                Console.WriteLine(sender.Name + ": " + evt.Params["message"]);
            }
        }
    }
}
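
As a rough sanity check on where saturation would be expected with these settings, the sketch below estimates the outbound payload rate. It assumes all clients sit in one room, that every public message is reflected to every client including the sender, and that the send delay averages about 350 ms (the midpoint of 200-500 ms). It ignores protocol and TCP/IP overhead, so real traffic would be higher. The class and method names are made up for the example.

```csharp
using System;

public static class BroadcastLoad
{
    // Payload bytes per second leaving the server when every client in one room
    // sends public messages that are reflected to all clients (sender included).
    public static double OutboundBytesPerSecond(int clients, double avgDelayMs, int payloadBytes)
    {
        double msgsPerClientPerSec = 1000.0 / avgDelayMs;
        double inboundMsgsPerSec = clients * msgsPerClientPerSec;
        double outboundMsgsPerSec = inboundMsgsPerSec * clients; // fan-out to the whole room
        return outboundMsgsPerSec * payloadBytes;
    }

    public static void Main()
    {
        // Values taken from the test settings: ~350 ms average delay, 111-byte payload.
        foreach (int n in new[] { 20, 40, 60, 75 })
        {
            double bytes = OutboundBytesPerSecond(n, 350.0, 111);
            Console.WriteLine("{0} clients: {1:F2} Mbit/s of payload", n, bytes * 8 / 1e6);
        }
    }
}
```

With 60 clients this comes to roughly 9 Mbit/s of payload alone, and the figure grows with the square of the client count, so a few extra clients add a lot of outbound traffic.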

ThomasLund
Posts: 1297
Joined: 14 Mar 2008, 07:52
Location: Sweden

Post by ThomasLund » 09 Feb 2011, 20:08

The client has recently had an update (still in SVN), so it now tries to continue even after bad data. It is/was bad behaviour to simply drop out when it meets bad data of some sort.

Now, since it can continue operating, your client can also end up out of sync. It doesn't try to repair the data - it just tries to continue without dying.

/Thomas
tchen
Posts: 191
Joined: 11 Dec 2010, 14:14

Post by tchen » 09 Feb 2011, 21:19

Thanks, Thomas, for the update on the situation. Would there also be some sort of event callback we can hook into, so we can initiate our resync and validation code on the client? That would help a lot.

Cheers,
Ted
ThomasLund
Posts: 1297
Joined: 14 Mar 2008, 07:52
Location: Sweden

Post by ThomasLund » 10 Feb 2011, 06:53

Put on the idea list - thanks!

The primary problem (for you - even with a callback) would be knowing what exactly was missed. It could be a chat message - no big harm. Or it could be a user count message or room list updates. Or it could be essential extension responses that you simply have to have (payment receipt? login?).

It's hard to know how to recover. But at least a callback could put a message up in front of the user.

/Thomas
rparker
Posts: 19
Joined: 18 Oct 2010, 09:10

Post by rparker » 25 May 2011, 23:37

We've recently run into this issue. What is the current state of the client dll with regard to this error?
cnPauly
Posts: 13
Joined: 08 Apr 2009, 18:32

Post by cnPauly » 07 Jun 2011, 17:32

rparker wrote: We've recently run into this issue. What is the current state of the client dll with regard to this error?


bump
Lapo
Site Admin
Posts: 23008
Joined: 21 Mar 2005, 09:50
Location: Italy

Post by Lapo » 08 Jun 2011, 09:24

It is correct.
If you get to the point where you have a socket error, the client won't be able to recover, because it cannot tell from the stream of bytes where a new message starts.
The error should never happen. Even under the most horrible stress test you should get packet loss, but no data corruption.
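
A rough sketch of why a desynchronized stream cannot be recovered, assuming a simple marker-plus-length framing. The layout, the marker value, and all names below are made up for illustration and are not the real SFS2X wire format. Once a few bytes go missing, the reader's idea of where the next frame begins lands in the middle of a payload, and nothing in the bytes themselves marks the true boundary.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class FramingDemo
{
    public const byte TypeMarker = 0x12; // hypothetical "a message starts here" tag

    // Hypothetical frame layout: [marker:1][length:2, big-endian][payload:length]
    public static byte[] Encode(string text)
    {
        byte[] payload = Encoding.ASCII.GetBytes(text);
        var ms = new MemoryStream();
        ms.WriteByte(TypeMarker);
        ms.WriteByte((byte)(payload.Length >> 8));
        ms.WriteByte((byte)(payload.Length & 0xFF));
        ms.Write(payload, 0, payload.Length);
        return ms.ToArray();
    }

    // Walks the stream frame by frame; each frame's position depends on the previous length.
    public static List<string> DecodeAll(byte[] stream)
    {
        var result = new List<string>();
        int pos = 0;
        while (pos < stream.Length)
        {
            if (stream[pos] != TypeMarker)
                throw new InvalidDataException("Expected marker, found: " + stream[pos]);
            if (pos + 3 > stream.Length)
                throw new InvalidDataException("Truncated header");
            int len = (stream[pos + 1] << 8) | stream[pos + 2];
            if (pos + 3 + len > stream.Length)
                throw new InvalidDataException("Truncated payload");
            result.Add(Encoding.ASCII.GetString(stream, pos + 3, len));
            pos += 3 + len;
        }
        return result;
    }

    public static byte[] Concat(params byte[][] parts)
    {
        var all = new List<byte>();
        foreach (byte[] part in parts) all.AddRange(part);
        return all.ToArray();
    }

    // Simulates bytes that never reach the decoder.
    public static byte[] DropBytes(byte[] data, int start, int count)
    {
        var list = new List<byte>(data);
        list.RemoveRange(start, count);
        return list.ToArray();
    }

    public static void Main()
    {
        byte[] stream = Concat(Encode("ABCDEFGH"), Encode("BBBBBBBB"), Encode("HELLO"));
        Console.WriteLine(DecodeAll(stream).Count + " messages decoded from the intact stream");

        byte[] damaged = DropBytes(stream, 5, 4); // lose 4 bytes inside the first payload
        try
        {
            DecodeAll(damaged);
        }
        catch (InvalidDataException ex)
        {
            // The first frame swallows the start of the second one, so the next
            // "marker" is read from the middle of a payload of letters.
            Console.WriteLine(ex.Message);
        }
    }
}
```

In this construction the decoder fails on the value 66, which is ASCII 'B'. The test client in this thread sends payloads made of the letters A-Z, so the "found: 66" in the reported trace would at least be consistent with the reader landing inside a message body, though that is only a guess.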

I've been testing mostly AS3 clients with up to 8K messages per second, but I also did a few tests in C# with 10K messages/sec and have never seen a problem.
Can you provide a reproducible test case?

I am assuming that you are using SFS2X RC2 with the latest API, right?
Lapo
--
gotoAndPlay()
...addicted to flash games
tchen
Posts: 191
Joined: 11 Dec 2010, 14:14

Post by tchen » 09 Jun 2011, 11:30

The original test was against RC1. The server was running the default chat while two basic Mono clients running the provided code were bombarding it.

I haven't had the time to set it up again with RC2, so if someone could verify that the .NET client now has either the packet-drop event or some other notification, it would be appreciated.

Thanks.
Lapo
Site Admin
Posts: 23008
Joined: 21 Mar 2005, 09:50
Location: Italy

Post by Lapo » 10 Jun 2011, 11:48

You mean a "packet drop" notification on the client?
I am sorry, but this wouldn't really be possible. If the server is already struggling to send data to the client and the client queue is 100% full, there's no way we can send any other kind of notification.
If we could, then instead of sending it we would be sending the actual packet that is being dropped, right? ;)
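
To make the point concrete, here is a minimal sketch of a per-client outgoing queue with a drop-when-full policy. The class, its capacity, and the string "packets" are all hypothetical and not how SFS2X implements its queues. It only shows that a drop notice is itself a packet and competes for the same full queue.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-client outgoing queue; not SFS2X's actual implementation.
public class BoundedOutQueue
{
    private readonly Queue<string> _packets = new Queue<string>();
    private readonly int _capacity;

    public BoundedOutQueue(int capacity) { _capacity = capacity; }

    public int Count { get { return _packets.Count; } }
    public long Dropped { get; private set; }

    // Queues a packet, or drops it and bumps the counter when there is no room.
    public bool TryEnqueue(string packet)
    {
        if (_packets.Count >= _capacity)
        {
            Dropped++;
            return false;
        }
        _packets.Enqueue(packet);
        return true;
    }

    // Called when the socket can accept more data; returns null when the queue is empty.
    public string Dequeue()
    {
        return _packets.Count > 0 ? _packets.Dequeue() : null;
    }
}

public static class QueueDemo
{
    public static void Main()
    {
        var queue = new BoundedOutQueue(3);
        for (int i = 1; i <= 5; i++) queue.TryEnqueue("msg " + i);

        // A "packet dropped" notice is just another packet: it needs a free slot too.
        bool noticeQueued = queue.TryEnqueue("notice: packets were dropped");

        Console.WriteLine("queued={0} dropped={1} noticeQueued={2}",
            queue.Count, queue.Dropped, noticeQueued);
    }
}
```

With a capacity of 3, messages 4 and 5 are dropped, and the notice is rejected for exactly the same reason until the slow client drains the queue.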
tchen
Posts: 191
Joined: 11 Dec 2010, 14:14

Post by tchen » 12 Jun 2011, 20:56

I didn't mean sending any notification from the server to the client or vice versa, as obviously something is wrong with the socket. I meant an event signal on either side, instead of us trying to trap the exceptions - which unfortunately, when they bubble up, only contain information about the socket, and not the wrapping user/server objects.
Lapo
Site Admin
Posts: 23008
Joined: 21 Mar 2005, 09:50
Location: Italy

Post by Lapo » 13 Jun 2011, 12:19

It is still not clear what exactly you would like to accomplish.
Packets can be dropped by the hundreds when clients can't keep up or are experiencing a momentary slowdown.
Firing an event on every lost packet would overload the extension subsystem with potentially tons of events.
Even if we introduced a "throttling" system, I still don't see what would come next, after you get the notification.
If you want to disconnect the user, let the system do it for you by configuring the "sensitivity" at which users are disconnected.

Also try improving performance by configuring the Room Events: e.g. suppress the UserCount updates or slow them down significantly with the new throttling mechanism, etc.
tchen
Posts: 191
Joined: 11 Dec 2010, 14:14

Post by tchen » 19 Jun 2011, 10:51

Sorry if the thread was TL;DR. You can ignore the request to throttle the system, as our conversation with ThomasLund settled on what he was implementing - the client not just throwing a deep-level exception when the reader encountered bad bytes or misalignment from packet loss.

Obviously, it's not going to recover, but at the very least, if the SFS client recognized that there was a bad packet, caught it, and signaled an event with something useful (e.g. server id), then we could at least initiate a reconnect on our side of the API.
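
Something along these lines is what is being asked for, sketched under stated assumptions: `GuardedReader`, `ProtocolErrorEventArgs`, and the `ProtocolError` event are hypothetical names and not part of the SFS2X client API. The decoder is passed in as a delegate so that the example stands alone.

```csharp
using System;

// Hypothetical event payload; none of these names come from the SFS2X API.
public class ProtocolErrorEventArgs : EventArgs
{
    public string Host { get; private set; }
    public Exception Error { get; private set; }

    public ProtocolErrorEventArgs(string host, Exception error)
    {
        Host = host;
        Error = error;
    }
}

public class GuardedReader
{
    private readonly string _host;
    private readonly Action<byte[]> _decode;

    public event EventHandler<ProtocolErrorEventArgs> ProtocolError;

    public GuardedReader(string host, Action<byte[]> decode)
    {
        _host = host;
        _decode = decode;
    }

    // Returns true if the data decoded cleanly. On failure, raises ProtocolError with
    // the connection context attached instead of letting the raw exception escape.
    public bool OnData(byte[] data)
    {
        try
        {
            _decode(data);
            return true;
        }
        catch (Exception ex)
        {
            EventHandler<ProtocolErrorEventArgs> handler = ProtocolError;
            if (handler != null) handler(this, new ProtocolErrorEventArgs(_host, ex));
            return false;
        }
    }
}

public static class GuardedReaderDemo
{
    public static void Main()
    {
        // Stand-in decoder that rejects anything not starting with a made-up marker byte.
        var reader = new GuardedReader("xxx.xxx.xxx.xxx", data =>
        {
            if (data.Length == 0 || data[0] != 0x12)
                throw new FormatException("Unexpected first byte");
        });

        bool reconnectNeeded = false;
        reader.ProtocolError += (sender, e) =>
        {
            Console.WriteLine("Bad packet from {0}: {1}", e.Host, e.Error.Message);
            reconnectNeeded = true; // application code would disconnect and reconnect here
        };

        reader.OnData(new byte[] { 0x12, 1, 2 }); // decodes fine, no event
        reader.OnData(new byte[] { 66, 1, 2 });   // raises ProtocolError
        Console.WriteLine("Reconnect needed: " + reconnectNeeded);
    }
}
```

The handler receives which connection failed and why, which is the information that is missing when only the socket-level exception bubbles up.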
