OPNsense LAN Instability, AP/OPT Segmentation Review, and TL-SG108E Storm-Control Stabilization¶

Summary¶

This session focused on troubleshooting intermittent LAN and internet connectivity problems in a homelab network built around OPNsense, multiple TP-Link TL-SG108E smart switches, and TP-Link Archer access points. The original investigation began around a question of whether an OPT interface and a second downstream router were causing client internet failures. During troubleshooting, it was clarified that one downstream TP-Link device was already operating in access point mode rather than router mode, shifting attention back to the main LAN path.

The issue was ultimately narrowed to instability on the flat LAN segment rather than OPT1. The network was simplified conceptually as a flat switched network, and switch storm-control settings were applied uniformly. A flat storm-control threshold of 8000 Kbps on all switches appeared to stabilize the environment.

Environment¶

Router/firewall: OPNsense running on a CWWK 12th-gen Intel mini PC with Intel i3-N305
Primary LAN switching:
4 × TP-Link TL-SG108E Easy Smart switches
Wireless infrastructure:
TP-Link Archer AX80 in AP mode on LAN
TP-Link Archer AX72 Pro in AP mode on OPT1
Wired clients mentioned:
Proxmox nodes
Android TV box
OPNsense interfaces:
LAN
OPT1
Logical network design during this session:
Flat LAN on the main switched path
Separate OPT1 segment feeding the Archer AX72 Pro in AP mode
Services discussed:
OPNsense DHCP
OPNsense Unbound DNS
Firewall rules and outbound NAT
Switch storm control
Loop prevention
VLANs:
Discussed conceptually only
Not implemented during this session

Problem¶

Devices on the main LAN were slow to connect, sometimes failed to get internet access after connecting, and wired devices downstream of the smart switches were also losing connectivity. There was initial uncertainty about whether the issue was related to OPT1 segmentation, downstream TP-Link operating mode, DHCP behavior, or broader layer-2 instability.

Symptoms¶

Devices were slow to connect to the network
Once connected, devices often had no internet access
Wired devices connected through the smart-switch chain also lost connectivity
OPNsense logs showed some default deny and state-violation entries
There was uncertainty about whether a second downstream TP-Link was acting as a router or only as an AP
Connectivity problems affected both wireless and wired clients on the main LAN path

Actions Taken¶

Reviewed the original physical topology:
OPNsense LAN → TL-SG108E core switch → additional TL-SG108E switches → Proxmox nodes, Android TV box, and Archer AX80 in AP mode
OPNsense OPT1 → Archer AX72 Pro in AP mode
Considered whether the downstream TP-Link on the secondary path should operate as:
a router behind an OPNsense OPT interface, or
a pure AP bridged into the OPNsense segment
Clarified that the Archer on the secondary segment was already in AP mode, not router mode.
Determined that OPT1 appeared to be functioning correctly and shifted focus to the main LAN path.
Reviewed likely causes on LAN:
layer-2 loops
broadcast or multicast storms
rogue DHCP
AP uplink misconfiguration
switch-chain instability
Verified or discussed AP best practices:
AP mode enabled
DHCP disabled on APs
uplink should use LAN port rather than WAN in AP mode
management IPs should be static or reserved
Discussed assigning static IP addresses to the Archer APs for management.
Reviewed TL-SG108E management considerations:
password-reset path if login access was lost
later confirmed switch login was still using the default credentials
Reviewed TL-SG108E VLAN features conceptually:
MTU VLAN
Port-Based VLAN
802.1Q VLAN
PVID behavior
Confirmed VLANs were not needed at this time
Confirmed that loop prevention was already enabled by default on the switches.
Applied storm-control settings uniformly across the switches at a flat threshold of 8000 Kbps to simplify the configuration.
Observed that the storm-control change appeared to resolve the immediate LAN instability.

Key Findings¶

The problem path was the main LAN, not OPT1.
The secondary TP-Link device was already in AP mode, so the issue was not caused by an intended router-behind-OPT design.
Because the AP was bridged, clients behind it should have been using OPNsense-provided DHCP, gateway, and DNS rather than a separate routed path.
The symptoms matched a likely layer-2 issue more closely than a routing issue:
intermittent connectivity
slow association
wired and wireless impact
apparent stabilization after storm-control changes
Loop prevention was already enabled, but that alone was not sufficient to fully suppress the instability.
Applying storm control at 8000 Kbps on all switches appeared to mitigate the problem.
VLAN support on the TL-SG108E and OPNsense was discussed for future use, but VLANs were not part of the implemented fix.

Facts¶

OPT1 appeared stable during this session.
Archer AX72 Pro was confirmed to be in AP mode.
Archer AX80 was also being used in AP mode on LAN.
There were actually 4 TL-SG108E switches in the environment, not 3.
Loop prevention was enabled by default.
Storm control set to 8000 Kbps across the switches appeared to solve the issue for the time being.

Assumptions / Working Theories¶

The root cause was likely excessive broadcast, multicast, unknown unicast, or other layer-2 flood behavior on the flat LAN.
A loop, bursty discovery traffic, or MAC-table churn may have contributed even if loop prevention was enabled.
Rogue DHCP remained a theoretical possibility earlier in the investigation, but no direct evidence in this conversation confirmed it as the final cause.

Resolution¶

The practical workaround and current working fix was to keep the network flat and enable switch storm control uniformly at 8000 Kbps across the TL-SG108E switches. This simplified the configuration and appears to have stabilized LAN behavior.

No VLAN design was deployed during this work session. OPT1 remained separate from LAN, but the main issue was treated as a LAN switching problem rather than a firewall or routed-interface problem.

Validation¶

Success was validated informally by observed behavior after the switch change: - LAN connectivity appeared stable - Internet access returned for affected devices - The user reported that storm control seemed to have solved the issue - A flat 8000 Kbps threshold was retained because it worked and was easy to manage

Follow-Up Tasks¶

Change the default admin password on all TL-SG108E switches
Assign static management IPs to all TL-SG108E switches
Assign static management IPs or DHCP reservations to both Archer APs
Document exact switch port mappings:
core switch uplink to OPNsense
inter-switch uplinks
Proxmox-node ports
AP ports
Android TV box port
Back up switch configurations after confirming stability
Continue monitoring for:
renewed packet loss
discovery failures
intermittent wired drops
multicast-heavy application issues
Consider raising multicast or unknown-unicast storm thresholds on switch uplinks later if application-specific issues appear
Verify that only OPNsense is providing DHCP on the LAN segment
Disable EEE/Green Ethernet on uplinks, AP ports, and server ports if not already done
Keep VLAN planning deferred until the flat network remains stable over time

Lessons Learned¶

Do not assume a downstream TP-Link is routing; confirm whether it is in AP mode or router mode before designing around NAT or firewall behavior.
If both wired and wireless clients are unstable on the same flat LAN, investigate layer-2 conditions before focusing on routing.
Loop prevention alone may not be enough to stabilize a noisy switched environment.
A simple, flat storm-control policy can be a useful first stabilization step in a small homelab.
Keep management IPs for APs and switches fixed and documented.
Avoid adding VLAN complexity until the physical topology and baseline switching behavior are stable.

Command Reference¶

Command¶

ipconfig

What it does¶

Displays IP configuration on Windows clients.

Why it was relevant¶

Used or implied for checking whether a client received the correct IP address, default gateway, and DNS server.

Expected result¶

A client on the LAN should receive: - an IP in the LAN subnet - default gateway equal to the OPNsense LAN IP - DNS pointing to OPNsense or the intended DNS path

What success or failure indicates¶

Success: client is likely getting correct DHCP information
Failure: wrong gateway or DNS may indicate rogue DHCP, AP/router misconfiguration, or subnet mismatch

Notes¶

Low risk.

Command¶

ifconfig

What it does¶

Displays interface configuration on Unix-like systems.

Why it was relevant¶

Implied as the Linux/macOS equivalent of ipconfig for confirming client IP settings.

Expected result¶

The client interface should show the correct subnet, address, and routing context.

What success or failure indicates¶

Correct interface details support DHCP and addressing health
Incorrect subnet or no address points toward DHCP or physical connectivity issues

Notes¶

Low risk.

Command¶

ping 8.8.8.8

What it does¶

Tests raw IP connectivity without depending on DNS.

Why it was relevant¶

Used conceptually to distinguish routing/internet problems from DNS problems.

Expected result¶

Replies from 8.8.8.8 if the client has working connectivity to the internet.

What success or failure indicates¶

Success: routing and outbound connectivity are likely working
Failure: internet routing, firewall, NAT, or upstream path may be broken

Notes¶

Low risk.
A more policy-neutral test target in some environments may be the ISP gateway or another known reachable IP.

Command¶

ping example.com

What it does¶

Tests both DNS resolution and connectivity.

Why it was relevant¶

Used conceptually to determine whether hostname resolution was working after a raw IP ping test.

Expected result¶

The hostname should resolve and the destination should reply.

What success or failure indicates¶

If ping 8.8.8.8 works but this fails, DNS is the likely problem
If both fail, the issue is probably broader than DNS

Notes¶

Low risk.

Command¶

nslookup example.com

What it does¶

Queries DNS directly for a hostname.

Why it was relevant¶

Implied for validating whether OPNsense Unbound DNS was reachable and resolving names correctly.

Expected result¶

The command should return a valid DNS response from the intended resolver.

What success or failure indicates¶

Success: DNS service path is functioning
Failure: resolver access, Unbound interface binding, access lists, or upstream resolution may be broken

Notes¶

Low risk.

Command¶

traceroute 8.8.8.8

What it does¶

Shows the path packets take toward a destination on Unix-like systems.

Why it was relevant¶

Implied for checking whether traffic was traversing the expected local gateway path.

Expected result¶

The first hop should be the local router for that segment, followed by upstream path entries.

What success or failure indicates¶

Correct first hop confirms the expected default gateway
Unexpected first hops can reveal hidden NAT, wrong gateway assignment, or routing asymmetry

Notes¶

Low risk.

Safer / platform note¶

On Windows, the equivalent is:

tracert 8.8.8.8

Command¶

tracert 8.8.8.8

What it does¶

Windows equivalent of traceroute.

Why it was relevant¶

Suggested conceptually to verify that clients were using the correct gateway and path.

Expected result¶

The first hop should be the local router for the client subnet.

What success or failure indicates¶

Correct path supports proper DHCP and routing
Incorrect path suggests gateway or topology problems

Notes¶

Low risk.

Likely command used¶

arp -a

What it does¶

Displays the ARP cache on many client systems.

Why it was relevant¶

A likely troubleshooting step for checking IP-to-MAC relationships and detecting duplicate IP behavior or unexpected gateways.

Expected result¶

The gateway IP should map to the expected OPNsense MAC address.

What success or failure indicates¶

Expected mapping supports correct layer-2 forwarding
Flapping or unexpected MACs can suggest duplicate addressing or rogue infrastructure

Notes¶

Low risk.

OPNsense action¶

Diagnostics → States → Reset States

What it does¶

Clears the firewall state table in OPNsense.

Why it was relevant¶

Suggested after topology changes or interface-path changes to remove stale states that can produce state-violation logs or inconsistent connectivity.

Expected result¶

Existing connections are briefly interrupted, then rebuild using current topology and policy.

What success or failure indicates¶

Improvement afterward suggests stale states contributed to the issue
No improvement suggests the root cause is elsewhere

Notes¶

Moderate operational impact.
This is disruptive to active sessions and should be used carefully during production traffic.

OPNsense action¶

Services → DHCPv4 → Leases

What it does¶

Displays active DHCP leases on an interface.

Why it was relevant¶

Used to verify that clients were receiving addresses from OPNsense rather than from a rogue DHCP server.

Expected result¶

Affected clients should appear with addresses in the correct subnet.

What success or failure indicates¶

Expected leases support correct DHCP behavior
Missing or inconsistent leases may indicate DHCP conflict or wrong segment placement

Notes¶

Low risk.
Read-only diagnostic view.

OPNsense action¶

Services → Unbound DNS → General

What it does¶

Controls Unbound DNS listener interfaces and resolver behavior.

Why it was relevant¶

Discussed in detail because DNS issues can look like internet failures even when routing works.

Expected result¶

Unbound should listen on the intended internal interfaces and permit the intended client subnets.

What success or failure indicates¶

Correct configuration allows reliable name resolution
Misconfiguration can cause clients to appear “online but without internet”

Notes¶

Low risk when reviewing.
Changing settings may affect DNS for multiple subnets.

OPNsense action¶

Firewall → Rules → LAN

What it does¶

Shows and manages firewall rules for the LAN interface.

Why it was relevant¶

A standard check when determining whether the problem is routing/firewall-related or truly layer-2-related.

Expected result¶

A permissive LAN rule set should allow normal outbound traffic during basic troubleshooting.

What success or failure indicates¶

If rules are correct and the issue persists, the cause is more likely switching, DHCP, or DNS
Misordered or missing rules can block traffic in ways that mimic network instability

Notes¶

Moderate risk if modified.
Firewall changes affect reachability immediately.

OPNsense action¶

Firewall → NAT → Outbound

What it does¶

Controls outbound NAT policy.

Why it was relevant¶

Discussed when evaluating whether OPT-style routed networks were missing internet due to NAT configuration.

Expected result¶

Internal subnets requiring internet access should be translated correctly on WAN.

What success or failure indicates¶

Correct NAT allows outbound internet access
Missing NAT on a routed subnet breaks internet access even when local connectivity works

Notes¶

Moderate risk if changed.
Incorrect outbound NAT can disrupt all egress traffic.

OPNsense action¶

Interfaces → Assignments / Interfaces → OPT1

What it does¶

Assigns and configures routed interfaces such as OPT1.

Why it was relevant¶

Used conceptually when determining whether a second router or AP on OPT1 should be routed or bridged.

Expected result¶

OPT1 should have the intended subnet and service scope if used as a separate network.

What success or failure indicates¶

A healthy OPT1 with working clients suggests the main issue is elsewhere
Misconfiguration would isolate clients on that segment

Notes¶

Moderate risk if changed.
Interface changes can interrupt connected devices.

Switch action¶

QoS → Storm Control

What it does¶

Applies per-port thresholds to suppress excessive broadcast, multicast, or unknown-unicast traffic on the TL-SG108E switches.

Why it was relevant¶

This became the key stabilization step in the session.

Expected result¶

Flood behavior should be limited enough to prevent LAN instability while still allowing normal traffic.

What success or failure indicates¶

Improvement after enabling or adjusting storm control suggests a layer-2 flood condition was contributing
Overly aggressive thresholds may suppress legitimate traffic

Notes¶

Moderate operational risk.
Improper thresholds can interfere with legitimate traffic such as discovery or multicast-heavy applications.

Switch action¶

Loop Prevention / Loopback Detection

What it does¶

Attempts to detect and suppress switching loops on the TL-SG108E platform.

Why it was relevant¶

Reviewed because a loop or loop-like condition was one of the primary suspected causes of the LAN instability.

Expected result¶

The switch should detect and mitigate loop conditions automatically.

What success or failure indicates¶

If enabled but instability persists, additional controls such as storm control may still be required
A disabled state would increase exposure to accidental loops

Notes¶

Low risk to enable.
Important baseline protection in daisy-chained homelab switch topologies.

Switch action¶

Advanced → Network → LAN

What it does¶

Used on the Archer APs to set or review the management IP address.

Why it was relevant¶

Static IP assignment for AP management was part of the cleanup and hardening discussion.

Expected result¶

Each AP receives a stable management address in the correct subnet and outside the DHCP pool.

What success or failure indicates¶

Correct IPs make AP administration predictable
Wrong subnet or overlapping DHCP use can cause management-plane confusion

Notes¶

Low to moderate risk.
Changing the management IP can briefly disconnect the web session until reconnecting to the new address.

Switch / AP action¶

Factory reset via recessed reset button

What it does¶

Restores the switch or AP to factory defaults.

Why it was relevant¶

Discussed as a recovery method for TL-SG108E login access before it was discovered that the switches were still using the default credentials.

Expected result¶

Default management access and default credentials are restored.

What success or failure indicates¶

Successful reset restores admin access
It also removes custom configuration, requiring reconfiguration afterward

Notes¶

High operational risk.
Use only when necessary and after recording current settings if possible.