Skip to content

[2.1] Bots - reduce session writes#9126

Open
sbulen wants to merge 3 commits intoSimpleMachines:release-2.1from
sbulen:21_bot_sessions
Open

[2.1] Bots - reduce session writes#9126
sbulen wants to merge 3 commits intoSimpleMachines:release-2.1from
sbulen:21_bot_sessions

Conversation

@sbulen
Copy link
Contributor

@sbulen sbulen commented Feb 25, 2026

This PR aims to minimize I/O during a bot attack by eliminating unnecessary duplicative session writes. Session writes spike during botnet attacks, and the database management, e.g., the undo/redo logs, can get backed up with the extremely high volume of updates.

As it is today, at least one session update occurs any time index.php is invoked. Multiple session writes can occur, e.g., during redirects, attachment loads, notification checks, verifications, keepalives, and any time a bot passes a bogus session via URL or cookie.

The approach used here is to simply not write sessions during normal browsing activity, e.g., boards, topics, & messages. Sessions will be written whenever an action is taken; this avoids any loss of functionality dependent on the session.

Note that some of these bot attacks are what I call 'long tail' attacks, in which many IPs are used once and only once. This was a trademark of the 2025 Q3 Brazil/Vietnam attack; on some days, I would have 150,000+ IPs with only one topic or message request each... The problem being that each would get its own unique session & log_online record, causing a flood of writes & log activity. And then, of course, they would all need to be deleted during garbage collection.

With this change, they wouldn't be written at all.

One of the reasons so many sessions are written is that they are timestamped by SMF during the actual write, forcing each session write to be unique. My understanding is that this thwarts PHP's normal session update checks, as they would normally only be written when real updates occur. That may be an avenue for further testing & savings. At the moment, I'm not sure what the impacts would be of removing that particular timestamp.

I've been running this code on my prod forum with no issues.

If this is approved, I can submit a 3.0 version.

For more discussion see:
https://www.simplemachines.org/community/index.php?topic=592442.0
https://www.simplemachines.org/community/index.php?topic=592345.0

Feedback welcome.

Signed-off-by: Shawn Bulen <bulens@pacbell.net>
@sbulen
Copy link
Contributor Author

sbulen commented Feb 25, 2026

Of course the session itself is a cache... One thing I see a lot more of after implementing this is rebuilding the ban cache, & the corresponding queries for a ban by IP.

For guests, a session write could prevent multiple reads to rebuild the ban cache, and this would be helpful IF we were seeing a lot of requests from the same IPs. But we don't know in advance if that session info is going to be used. And we've definitely seen massive issues with db logs during the 'long tail' type of attack, where the majority of sessions weren't reused. I haven't seen that during other attacks. So... I leaned towards the far more restrictive session writes; it avoids the biggest issues we've seen. And what was in those session logs? For the most part, many copies of the same thing, over & over, with only timestamp & IP differences.

Logged on users aren't affected at all, as logging on is an action & those get cached. The questions are all around guests/bots, & how to minimize impacts of that activity.

@jdarwood007 jdarwood007 added this to the 2.1.8 milestone Mar 8, 2026
@jdarwood007
Copy link
Member

@sbulen Do you have the related updates for 3.0 on this PR?

@sbulen
Copy link
Contributor Author

sbulen commented Mar 8, 2026

I was going to add 3.0 versions once the 2.1 ones were approved. Trying to avoid duplication of rework, etc.

This one is probably the highest risk, & requires the most scrutiny.

I haven't seen issues at all in my testing. In fact, I have this on my prod board. But it's always possible I missed something, some impact to some weird action I don't use, etc.

@sbulen
Copy link
Contributor Author

sbulen commented Mar 10, 2026

Interesting... Since upgrading to 2.1.7, certain bot activity is now triggering errors in the error log. I believe due to this particular tweak (which I have installed via a modlet).

The errors I am seeing:

  • Some (not all) "view likes" bot requests, with invalid session vars passed via URL, are generating "Unable to verify referring URL" errors. Kinda makes sense, as it does look at the session for the old_URL in the session check. But... Why would this have never been triggered before? The invalid sessions have always been ignored/destroyed & started over.
  • Some activity (I think hack attempts, sql injection) is generating "Wrong value type sent to the database. IPv4 or IPv6 expected.(ip) Function: is_not_banned".

Not sure why I am only seeing this after 2.1.7. I had this installed as a modlet (as well as one for the PHPSESSID change) in 2.1.6. So in theory, nothing changed wrt sessions...

Still researching. First step is to deinstall this modlet & see if the errors go away. And definitively line up log entries with the errors; and get to where I can reproduce at will. Not there yet. Still in the chasing-red-herrings phase...

EDIT: Both of these errors have been reported in the forum before by others. So possibly completely unrelated to any of the session tweaks. But still feels like a bit much of a coincidence that I am only seeing them now...

@sbulen
Copy link
Contributor Author

sbulen commented Mar 10, 2026

OK the referral issue has nothing to do with this PR. I can reproduce the error on vanilla 2.1.7 & 2.1.6.

I see in my web logs that these are bots that are attempting to make it look like they're coming from google, by spoofing the referrer.

To reproduce:

  1. Enable likes
  2. Spoof referrer as https://www.google.com/, I just do something like this early in index.php:
    $_SERVER['HTTP_REFERER'] = 'https://www.google.com/';
  3. As a guest, make a view likes request... It's important to use a valid msg ID after 'like=' here:
    https://yourtestforum/smf/index.php?action=likes;sa=view;ltype=msg;like=1;fd1f528f7b=65637f9622a746062d68df94efab633a

Under most circumstances, the spoofed referrer is detected, but no error is logged - it just kicks the guest back to the board index. (I am assuming SMF smartly didn't want to make it that easy for a bot to flood the error log...)

But a valid-looking view likes request, with a valid msg ID & a bogus session, logs the error. And that's what I'm seeing on my forum.

So this one was, in fact, a red herring.

Valid question, though, whether we want even these logged... We're still in a situation where a bot can flood the error log with invalid requests. As noted earlier, this has been reported in the forum:
https://www.simplemachines.org/community/index.php?topic=593808.0

@sbulen
Copy link
Contributor Author

sbulen commented Mar 10, 2026

Notes on the 'wrong value type' error...

  • Definitely bad guys. They are asking for the same topic over & over. A couple hours later, similar probes, from different IPs.
  • With each GET, they are systematically probing via various escape strings, e.g., in the request, the referrer, the user agent, etc. Some of this is visible in the logs, some isn't, e.g., the access log doesn't show all headers & cookie info.
  • Most of those are kicked back OK with a 200; the topic is displayed & bogus data is just ignored.
  • For some of these, though, the IP is getting stepped on; I suspect they're mucking with the cookie or other headers. The ban check fails, & SMF attempts to report the error.
  • When this happens, the query fails with a 500, though, which might be an SMF issue. Needs more testing, but in PHP8.4, which I'm on, trigger_error() passing E_USER_ERROR has been deprecated. And that is what is passed to the backtrace. I think that's why there is an accompanying 500.
  • But that's after the actual error, the bogus IP in the ban check query. I'm still not sure how the IP is getting stepped on. It's present in the web access log, but not in the SMF error log, and is the cause of the broken query.

Still researching. Thinking aloud while doing so...

@sbulen
Copy link
Contributor Author

sbulen commented Mar 11, 2026

I've been able to reproduce the 'wrong value type' error in 2.1.7 without this PR.

So neither of the issues I'm seeing are tied to this PR. Interesting.

Good lessons though; I think a couple tweaks are needed to reduce errors in the log.

@sbulen
Copy link
Contributor Author

sbulen commented Mar 11, 2026

Logged some issues based on the lessons & notes above:
#9141 - E_USER_ERROR usage deprecated in trigger_error() in PHP 8.4
#9142 - "Unable to verify referring URL" errors can flood the SMF error log
#9143 - "Wrong value type sent to the database. IPv4 or IPv6 expected."

@live627
Copy link
Contributor

live627 commented Mar 12, 2026

I haven't tested this but guest posting with verification via quick reply might not work.

@sbulen
Copy link
Contributor Author

sbulen commented Mar 12, 2026

I haven't tested this but guest posting with verification via quick reply might not work.

Tests OK. Anything with an action= proceeds as normal.

GREAT CATCH... I had been testing with Reply, not Quick Reply... And there is no action= with QuickReply...

This will require some thought.

I tried multilple browsers, some appeared to work. But it varies between failing outright or just being flaky.

It's a bug.

@sbulen
Copy link
Contributor Author

sbulen commented Mar 13, 2026

@live627 - In a similar vein, even logged in user post verification, when using quick reply, if they do not yet meet the post count to bypass verification.

Under the covers, the same thing: No action, but verification required, so the verification code needs to be in the session.

Hmmm...

Signed-off-by: Shawn Bulen <bulens@pacbell.net>
@sbulen
Copy link
Contributor Author

sbulen commented Mar 13, 2026

I believe the latest commit addresses the issue @live627 raised.

Good input.

Signed-off-by: Shawn Bulen <bulens@pacbell.net>
@sbulen
Copy link
Contributor Author

sbulen commented Mar 13, 2026

I think the solution to that last issue, checking $context['require_verification'], works ok in practice.

But it feels like it deserves more structured support than querying $context, or even the raw settings & user post count...

I wonder if require_verification should be a 3.0 User:: or 2.1 $user_info attribute... Probably needs to be split into require_search_verif & require_post_verif, etc, and set upon load based on settings and known user info. Just thinking aloud...

Copy link
Member

@jdarwood007 jdarwood007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, will merge later with the 3.0 version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants