FusionAuth

    FusionAuth keeps restarting after upgrade from version 1.36.6 -> 1.38.1 even after successful migrations

    • prithwhat

      My application usually starts up with no issues, but when I use the admin console I notice performance problems that eventually lead to the deployment crashing and restarting. While this happens, it also uses up the CPU of the Postgres instance.

      The application is deployed on Kubernetes, and these are the resources given to it:

          resources:
            limits:
              cpu: 1000m
              memory: 3Gi
            requests:
              cpu: 500m
              memory: 1Gi
      

      The app's memory and CPU usage do spike, but they always stay within the limits, even right before the restart/crash.

      I initially thought the issue was due to an incomplete 1.37 migration run through silent mode, but the issue persists even after running the migration manually.

      In my container logs I can see

      2025-03-04 11:50:26.118 AM INFO com.inversoft.jdbc.hikari.DataSourceProvider - Connecting to PostgreSQL database at [jdbc:postgresql://{postgres_db_url}]

      2025-03-04 11:50:26.120 AM WARN com.zaxxer.hikari.HikariConfig - HikariPool-1 - idleTimeout has been set but has no effect because the pool is operating as a fixed size pool.

      2025-03-04 11:50:26.122 AM INFO com.zaxxer.hikari.HikariDataSource - HikariPool-1 - Starting...

      2025-03-04 11:50:26.851 AM INFO com.zaxxer.hikari.pool.HikariPool - HikariPool-1 - Added connection org.postgresql.jdbc.PgConnection@557b6a37

      2025-03-04 11:50:38.245 AM INFO io.fusionauth.api.service.system.NodeService - Node [27af95a4-74ca-4d4f-92dc-a06306e7a597] promoted to master at [2025-03-04T11:50:38.245353677Z], the previous master Node [80222721-88f3-4b24-85f4-3af56b57732d] has been shutdown or removed

      2025-03-04 11:50:39.164 AM INFO io.fusionauth.app.primeframework.FusionHTTPContextAuthSetup - Initializing the FusionAuth HTTP Context.

      2025-03-04 11:50:39.428 AM INFO org.primeframework.mvc.netty.PrimeHTTPServer - Starting FusionAuth HTTP server on port [9011]

      2025-03-04 11:50:39.748 AM INFO org.primeframework.mvc.netty.PrimeHTTPServer - Starting FusionAuth HTTP loopback server on port [9012]

      2025-03-04 11:57:51.749 AM INFO org.primeframework.mvc.netty.PrimeHTTPServer - Shutting down the Prime HTTP server [/0.0.0.0:9011]

      2025-03-04 11:57:51.750 AM INFO org.primeframework.mvc.netty.PrimeHTTPServer - Shutting down the Prime HTTP server [/0.0.0.0:9012]

      2025-03-04 11:57:51.750 AM INFO org.primeframework.mvc.PrimeMVCRequestHandler - Shutting down Prime MVC

      2025-03-04 11:57:51.750 AM INFO org.primeframework.mvc.netty.PrimeHTTPServer - Gracefully closing the server resources

      2025-03-04 11:57:51.751 AM ERROR org.primeframework.mvc.guice.GuiceBootstrap - Unable to shutdown Closeable [Key[type=org.apache.ibatis.session.SqlSessionManager, annotation=[none]]]

      • mark.robustelli @prithwhat

        @prithwhat It looks like the resources are well above the minimum, so I'm not sure that is the cause. I will search around a bit to see if I can learn anything from the logs you provided.

        I assume you have seen our docs on Deploying FusionAuth to Kubernetes, but it may make sense to take some time to review and see if anything jumps out at you.

        • mark.robustelli @prithwhat

          @prithwhat The line

          2025-03-04 11:50:38.245 AM INFO io.fusionauth.api.service.system.NodeService - Node [27af95a4-74ca-4d4f-92dc-a06306e7a597] promoted to master at [2025-03-04T11:50:38.245353677Z], the previous master Node [80222721-88f3-4b24-85f4-3af56b57732d] has been shutdown or removed

          is interesting. Do you have any sort of health checks that restart a node if it can't be accessed? I did see some info about someone raising the timeout from 30 to 60 seconds, and that helped with their issue.

          • prithwhat @mark.robustelli

            @mark-robustelli Hey, first of all, thanks for taking the time to check this.

            The chart does have default liveness, readiness, and startup probes specified:

                livenessProbe:
                  httpGet:
                    path: /
                    port: http
                  failureThreshold: 3
                  periodSeconds: 30
                  timeoutSeconds: 5

                # readinessProbe -- Configures a readinessProbe to ensure fusionauth is ready for requests
                readinessProbe:
                  httpGet:
                    path: /
                    port: http
                  failureThreshold: 5
                  timeoutSeconds: 5

                # startupProbe -- Configures a startupProbe to ensure fusionauth has finished starting up
                startupProbe:
                  httpGet:
                    path: /
                    port: http
                  failureThreshold: 20
                  periodSeconds: 10
                  timeoutSeconds: 5

            But I'm not sure this is the issue, because in that case the app would not start in the first place. If left unused it stays healthy; it only starts to shut down once I start using the admin console.
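For reference, the probe values quoted in this post can be turned into rough time windows. The sketch below is only arithmetic on those values, not a diagnosis; the `Probe` class and its method are hypothetical names made up for illustration. It relies on standard Kubernetes behavior: a probe attempt fails if no response arrives within `timeoutSeconds`, the kubelet acts after `failureThreshold` consecutive failures, and `periodSeconds` defaults to 10 when it is not set (as in the readiness probe here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    """Hypothetical container for the probe settings quoted in the thread."""
    failure_threshold: int
    period_seconds: int
    timeout_seconds: int

    def failure_window_seconds(self) -> int:
        # Approximate span of consecutive failed checks before the kubelet acts.
        return self.failure_threshold * self.period_seconds


# Values copied from the chart defaults quoted in the post.
liveness = Probe(failure_threshold=3, period_seconds=30, timeout_seconds=5)
startup = Probe(failure_threshold=20, period_seconds=10, timeout_seconds=5)
# periodSeconds is not set for the readiness probe; 10 is the Kubernetes default.
readiness = Probe(failure_threshold=5, period_seconds=10, timeout_seconds=5)

liveness_window = liveness.failure_window_seconds()
startup_budget = startup.failure_window_seconds()
readiness_window = readiness.failure_window_seconds()

print(f"startup budget:   ~{startup_budget}s to answer GET / once")
print(f"liveness window:  ~{liveness_window}s of failed checks before a restart")
print(f"readiness window: ~{readiness_window}s of failed checks before removal from the Service")
```

Under these assumptions, a pod that passes the startup probe can still be restarted later if `GET /` takes longer than 5 seconds on three consecutive liveness checks, which could in principle happen only while the instance is under load.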

            • mark.robustelli @prithwhat

              @prithwhat OK, so in the log you posted above

              2025-03-04 11:50:39.748 AM INFO org.primeframework.mvc.netty.PrimeHTTPServer - Starting FusionAuth HTTP loopback server on port [9012]
              
              2025-03-04 11:57:51.749 AM INFO org.primeframework.mvc.netty.PrimeHTTPServer - Shutting down the Prime HTTP server [/0.0.0.0:9011]
              

              Was it about 7 minutes between the time you spun up FusionAuth and when you tried to log in to the admin UI?
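The gap between those two log lines can be checked directly from their timestamps. This is a minimal sketch that assumes every log line starts with a `YYYY-MM-DD hh:mm:ss.mmm AM/PM` stamp, as in the excerpt; `parse_log_time` is a hypothetical helper, not part of FusionAuth.

```python
from datetime import datetime

# 12-hour clock with fractional seconds, matching the stamps in the excerpt.
LOG_TIME_FORMAT = "%Y-%m-%d %I:%M:%S.%f %p"


def parse_log_time(line: str) -> datetime:
    """Parse the leading date, time, and AM/PM tokens of a log line."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


started = parse_log_time(
    "2025-03-04 11:50:39.748 AM INFO org.primeframework.mvc.netty.PrimeHTTPServer"
    " - Starting FusionAuth HTTP loopback server on port [9012]"
)
stopped = parse_log_time(
    "2025-03-04 11:57:51.749 AM INFO org.primeframework.mvc.netty.PrimeHTTPServer"
    " - Shutting down the Prime HTTP server [/0.0.0.0:9011]"
)

uptime = stopped - started
print(uptime)  # 0:07:12.001000
```

So the server ran for a little over 7 minutes between finishing startup and beginning shutdown.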

              • prithwhat @mark.robustelli

                @mark-robustelli

                It's possible, I didn't really keep track of that per se

                We eventually moved on from this version and jumped to 1.40.x. Surprisingly, there were no issues in that version, so this post can be closed.

                That being said, I did face a similar issue in a newer version, for which I might post another thread.
