Configuring a cluster

Hello experts!

There are two hosts: HV1 and HV2.

Each host runs a Job Server and a Job Server DB: JS1 and JS2, DB1 and DB2.

A Windows Server 2019 failover cluster was set up, in which the JS and DB servers are combined. This gives us JS-Cluster and DB-Cluster.

We use SQL Always On (Standard).

Difficulties with configuring a cluster in IDM:

1. In JobQueueInfo, our clusters are shown with errors.
2. If we try to open the web interface, nothing is there, even at 127.0.0.1:1880.

How do I fix this? I looked at the instructions, but they did not help. What am I missing?

  • Logging severity: Info.
    <i>2021-04-27 12:00:36 +03:00 - Info: Requesting process steps for queue \DB-CL.<x>
    <i>2021-04-27 12:00:36 +03:00 - Info: Deleting log file JobService.log_2021-03-19-12-00-00 ...<x>
    <r>2021-04-27 12:00:36 +03:00 - Serious: Last process step request failed with error: '[810023] Error during execution of statement: exec QBM_PJobQueueLoad N'\DB-CL', 90, -1, 1180940955, N'', N'a3af52ab-28d1-462c-ad3a-534b607cea4d'-->[810143] Database error 50000: detected in (SRV=DB2, DB=OneIM) Procedure QBM_PJobQueueLoad, Line 42-->[810143] Database error 50000: Session ID for queue \DB-CL does not match. DB: 49f0dc9c-59c3-4df9-9f30-db52205b2cd3 Query:a3af52ab-28d1-462c-ad3a-534b607cea4d.'!<x>
    <e>2021-04-27 12:00:36 +03:00 - Error occurred in DbRequestQueue.Process (thread: Database Job Requests):
    [821002] Error requesting queue '\DB-CL' for database 'db-ao.my_domain_name\OneIM'.
    [810023] Error during execution of statement: exec QBM_PJobQueueLoad N'\DB-CL', 90, -1, 1180940955, N'', N'a3af52ab-28d1-462c-ad3a-534b607cea4d'
    [810143] Database error 50000: detected in (SRV=DB2, DB=OneIM) Procedure QBM_PJobQueueLoad, Line 42
    [810143] Database error 50000: Session ID for queue \DB-CL does not match. DB: 49f0dc9c-59c3-4df9-9f30-db52205b2cd3 Query:a3af52ab-28d1-462c-ad3a-534b607cea4d.<x>
    <d> at VI.JobService.DbProvider.DbRequestQueue.Process(ProviderRequest request)
    at VI.JobService.MSSqlJobProvider._MsSqlRequestQueue._HandleGetJobs(IDbSession dbSession, GetJobsProviderRequest request)
    ---- Start of Inner Exception ----
    at VI.JobService.MSSqlJobProvider._MsSqlRequestQueue._HandleGetJobs(IDbSession dbSession, GetJobsProviderRequest request)
    at VI.Base.SyncActions.Do[T](Func`1 function)
    at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
    ---- Start of Inner Exception ----
    at VI.DB.DataAccess.ReadOnlyDbSession.<SqlExecuteAsync>d__38.MoveNext()
    ---- Start of Inner Exception ----
    at VI.DB.DataAccess.ReadOnlyDbSession.<SqlExecuteAsync>d__38.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at VI.DB.DataAccess.ReadWriteDbSession.<IgnoreBrokenConnectionAsync>d__48`1.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at VI.DB.DataAccess.ReadOnlyDbSession.<>c__DisplayClass38_0.<<SqlExecuteAsync>b__0>d.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at VI.DB.DataAccess.SafeDbCommand.<_CheckedAsync>d__40`1.MoveNext()
    ---- Start of Inner Exception ----
    <x>

  • Hi Boris,

    the error shown, "Session ID for queue <queue name> does not match", indicates that both Jobservice instances, DB1 and DB2, of the DB-Cluster are running.
    When running Jobservices on a failover cluster, the cluster has to be configured so that only the Jobservice on the active node is running. The Jobservices on all standby nodes should be stopped.

    Regarding the other issues:
    The JobQueueInfo checks for the Jobservice instances by sending a heartbeat ping to the Jobservice web interface. For a failover cluster, the fully qualified name of the failover cluster should be entered as the FQDN of the Jobservice in the database.

    If the Jobservice web interface is not reachable, missing permissions are the likely cause. Check whether the account the Jobservice runs as has permission to allocate the configured port. (Keywords: netsh add/show urlacl)
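
    A minimal sketch of the netsh calls hinted at above. It only builds the command lines and does not run them. The port 1880 comes from the question; the wildcard URL form http://+:1880/ and the account name used in the usage note are assumptions, so check what your Jobservice actually binds to.

```python
def urlacl_commands(port: int, account: str) -> dict:
    """Build netsh command lines to inspect and grant a URL reservation.

    Assumption: the service listens on the wildcard prefix http://+:<port>/.
    The account is whatever Windows account the Jobservice runs as.
    """
    url = f"http://+:{port}/"
    return {
        # Lists the existing reservation for this URL, if there is one.
        "show": ["netsh", "http", "show", "urlacl", f"url={url}"],
        # Reserves the URL for the given account.
        "add": ["netsh", "http", "add", "urlacl", f"url={url}", f"user={account}"],
    }


if __name__ == "__main__":
    # "MYDOMAIN\svc-jobservice" is a hypothetical service account name.
    for name, cmd in urlacl_commands(1880, "MYDOMAIN\\svc-jobservice").items():
        print(name, "->", " ".join(cmd))
```

    Adding a reservation has to be done from an elevated prompt on each node that can host the service.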

  • It looks like your manual failover does not stop the Jobservice on node JS1 and does not start the Jobservice on JS2. Taking the first node offline, however, does seem to cause a proper failover. Further configuration might be required. Unfortunately, I'm not familiar with Windows Server failover cluster configuration and can't advise on the configuration details.
    You might want to ask support for help. I'm almost certain there is a document regarding failover cluster configuration.

  • Thank you!

    I will write to support. If a solution to my problem appears, I will write here.

  • We contacted support, but there are no more detailed instructions.

    I have enabled logging on the job servers that we have in the cluster. Switching from node to node does not work correctly there. Maybe you can decipher our log and tell me what to fix?

    Logging severity: Warning.
    <e>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Error occurred in JobService.Initialize (thread: <Unknown>):
    [821045] Could not create job provider sqlprovider.
    [System.Reflection.TargetInvocationException] Exception has been thrown by the target of an invocation.
    [809012] Error reading configuration value ConnectString.
    [809004] Could not get value ConnectString.
    [809003] Error encrypting value.
    [System.Security.Cryptography.CryptographicException] Key not valid for use in specified state.
    <x>
    <d> at VI.JobService.JobService._InitializeJobProviders()
    at System.Activator.CreateInstance(Type type, Object[] args)
    at System.Activator.CreateInstance(Type type, BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes)
    at System.RuntimeType.CreateInstanceImpl(BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes, StackCrawlMark& stackMark)
    at System.Reflection.RuntimeConstructorInfo.Invoke(BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
    at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor)
    ---- Start of Inner Exception ----
    at VI.JobService.MSSqlJobProvider..ctor(RequestDispatcher dispatcher, String id)
    at VI.Base.ConfigSettings._Init()
    at VI.Base.ConfigSettings._ReadConfig()
    ---- Start of Inner Exception ----
    at VI.Base.ConfigSettings._ReadConfig()
    at VI.Base.ConfigDataExtensions.CheckRequiredParameters(IConfigData category, String[] parameters)
    at VI.Base.EncryptedConfigData.Get(String valueName)
    ---- Start of Inner Exception ----
    at VI.Base.EncryptedConfigData.Get(String valueName)
    at VI.Base.EncryptedConfigScope._Decode(String data)
    ---- Start of Inner Exception ----
    at VI.Base.EncryptedConfigScope._Decode(String data)
    at System.Security.Cryptography.ProtectedData.Unprotect(Byte[] encryptedData, Byte[] optionalEntropy, DataProtectionScope scope)<x>
    <r>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Serious: No job providers configured.<x>
    <r>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Serious: Provider value names: sqlprovider<x>
    <w>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Warning: The service has no write permissions for its own directory. These are required for automatic updates.<x>
    <e>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Error occurred in Job Service (thread: <Unknown>):
    [821049] Error starting One Identity Manager Service.
    [System.Exception] No job provider configured.<x>
    <d> at VI.JobService.JobService._StartJobService()<x>
    <w>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Warning: Error starting the service. Retrying after 00:01:30.<x>
    <e>2021-06-09 10:49:36 +03:00 - M2IDMJS-ONEIMSE - Error occurred in JobService.Initialize (thread: <Unknown>):
    [821045] Could not create job provider sqlprovider.
    [System.Reflection.TargetInvocationException] Exception has been thrown by the target of an invocation.
    [809012] Error reading configuration value ConnectString.
    [809004] Could not get value ConnectString.
    [809003] Error encrypting value.
    [System.Security.Cryptography.CryptographicException] Key not valid for use in specified state.
    <x>
    <d> at VI.JobService.JobService._InitializeJobProviders()
    at System.Activator.CreateInstance(Type type, Object[] args)
    at System.Activator.CreateInstance(Type type, BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes)
    at System.RuntimeType.CreateInstanceImpl(BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes, StackCrawlMark& stackMark)
    at System.Reflection.RuntimeConstructorInfo.Invoke(BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
    at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor)
    ---- Start of Inner Exception ----
    at VI.JobService.MSSqlJobProvider..ctor(RequestDispatcher dispatcher, String id)
    at VI.Base.ConfigSettings._Init()
    at VI.Base.ConfigSettings._ReadConfig()
    ---- Start of Inner Exception ----
    at VI.Base.ConfigSettings._ReadConfig()
    at VI.Base.ConfigDataExtensions.CheckRequiredParameters(IConfigData category, String[] parameters)
    at VI.Base.EncryptedConfigData.Get(String valueName)
    ---- Start of Inner Exception ----
    at VI.Base.EncryptedConfigData.Get(String valueName)
    at VI.Base.EncryptedConfigScope._Decode(String data)
    ---- Start of Inner Exception ----
    at VI.Base.EncryptedConfigScope._Decode(String data)
    at System.Security.Cryptography.ProtectedData.Unprotect(Byte[] encryptedData, Byte[] optionalEntropy, DataProtectionScope scope)<x>
    <r>2021-06-09 10:49:36 +03:00 - M2IDMJS-ONEIMSE - Serious: No job providers configured.<x>
    <r>2021-06-09 10:49:36 +03:00 - M2IDMJS-ONEIMSE - Serious: Provider value names: sqlprovider<x>
    <w>2021-06-09 10:49:36 +03:00 - M2IDMJS-ONEIMSE - Warning: The service has no write permissions for its own directory. These are required for automatic updates.<x>
    <e>2021-06-09 10:49:36 +03:00 - M2IDMJS-ONEIMSE - Error occurred in Job Service (thread: <Unknown>):
    [821049] Error starting One Identity Manager Service.
    [System.Exception] No job provider configured.<x>
    <d> at VI.JobService.JobService._StartJobService()<x>
    <w>2021-06-09 10:49:36 +03:00 - M2IDMJS-ONEIMSE - Warning: Error starting the service. Retrying after 00:01:30.<x>
    <e>2021-06-09 10:51:06 +03:00 - M2IDMJS-ONEIMSE - Error occurred in Job Service (thread: <Unknown>):
    [821049] Error starting One Identity Manager Service.
    [System.Exception] No job provider configured.<x>
    <d> at VI.JobService.JobService._StartJobService()<x>
    <w>2021-06-09 10:51:06 +03:00 - M2IDMJS-ONEIMSE - Warning: Error starting the service. Retrying after 00:01:30.<x>

  • Error reading configuration value ConnectString.
    [809004] Could not get value ConnectString.
    [809003] Error encrypting value.
    [System.Security.Cryptography.CryptographicException] Key not valid for use in specified state.

    In most cases, this means that the Job Service user has no access to the private.key file used to encrypt the database.

    <r>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Serious: No job providers configured.<x>
    <r>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Serious: Provider value names: sqlprovider<x>
    <w>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Warning: The service has no write permissions for its own directory. These are required for automatic updates.<x>

    The Job Service needs to have write access to its own directory.

  • In most cases, this means that the Job Service user has no access to the private.key file used to encrypt the database.

    And where should private.key be placed (in which folder)? I thought that when I installed the service and pointed the installer to the key, it copied the key.

  • According to the error, the JobService cannot read or decrypt the connection string stored in the JobService configuration file.
    On startup, the JobService encrypts any unencrypted sensitive information. The key is stored using a Windows API and is bound to the machine and account.

    Reusing a JobService configuration file for a different account or on a different machine causes this error.
    If you copied the configuration from the other node, edit it and re-enter the connection information.
    If you use a shared binary directory, re-enter the connection information and activate the "Do not protect encrypted configuration" option. This option has side effects that need to be considered.
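
    To illustrate why a copied configuration file fails, here is a toy model of encryption that is bound to machine and account. The stack trace ends in ProtectedData.Unprotect, the .NET wrapper around Windows DPAPI; the code below is not DPAPI and does not derive keys the way Windows does. The key derivation, the function names, and the machine and account names are all made up for illustration.

```python
import hashlib
import hmac


class KeyNotValidError(Exception):
    """Raised when a blob is opened under a different machine or account."""


def _derive_key(machine: str, account: str) -> bytes:
    # Toy stand-in for the OS-managed secret; real DPAPI keys are NOT derived like this.
    return hashlib.sha256(f"{machine}\\{account}".encode("utf-8")).digest()


def _keystream(key: bytes, length: int) -> bytes:
    # Expand the key into `length` pseudo-random bytes (illustration only).
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def protect(plaintext: str, machine: str, account: str) -> bytes:
    """Encrypt a value so only the same machine/account pair can read it."""
    key = _derive_key(machine, account)
    data = plaintext.encode("utf-8")
    cipher = bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))
    tag = hmac.new(key, cipher, hashlib.sha256).digest()
    return tag + cipher


def unprotect(blob: bytes, machine: str, account: str) -> str:
    """Decrypt a value; fails if machine or account differ from protect()."""
    key = _derive_key(machine, account)
    tag, cipher = blob[:32], blob[32:]
    expected = hmac.new(key, cipher, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise KeyNotValidError("Key not valid for use in specified state.")
    data = bytes(a ^ b for a, b in zip(cipher, _keystream(key, len(cipher))))
    return data.decode("utf-8")
```

    In this model, a value protected on JS1 under one account opens again only on JS1 under that same account. Calling unprotect with JS2, or with another account, raises the error, which mirrors what happens when the configuration file is copied between nodes.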

  • A copy of the private.key file has to be placed into the installation directory of the JobService.
    Private key files are consumed (and deleted) by the JobService on startup and stored bound to account and machine (using a Windows API).

    When changing the account or machine, the private key file has to be copied into the installation directory of the JobService prior to its startup. This has to be done once per combination of account and machine.

    When using a shared installation directory, the above is required once per cluster node, or the option "Do not protect private keys" can be activated. This option has security implications!

  • The private.key file will be removed during startup if not configured otherwise. Are you trying to share the private.key between both instances?

  • What I've done:
    1. In the Designer, I created a Job server and selected the Server Cluster and One Identity Server installed checkboxes.
    2. I installed the service through the Designer on the required servers. During installation, I specified the path to the distribution kit and the key.
    3. I saved the configuration file from the Designer, specifying the queue.
    4. I start the Windows service under a specially created account on each job server.
    5. I gave this account full rights to the folder C:\Program Files\One Identity.
    6. I copied the configuration file to each job server, applied it, saved it, and restarted the service.

    PS. The database is encrypted.


  • I copied the key to the program directory, and it disappears. But the problem remains.

  • [System.Reflection.TargetInvocationException] Exception has been thrown by the target of an invocation.
    [809012] Error reading configuration value ConnectString.
    [809004] Could not get value ConnectString.
    [809003] Error encrypting value.
    [System.Security.Cryptography.CryptographicException] Key not valid for use in specified state.
    <x>
    <d> at VI.JobService.JobService._InitializeJobProviders()
    at System.Activator.CreateInstance(Type type, Object[] args)
    at System.Activator.CreateInstance(Type type, BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes)
    at System.RuntimeType.CreateInstanceImpl(BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes, StackCrawlMark& stackMark)
    at System.Reflection.RuntimeConstructorInfo.Invoke(BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
    at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor)
    ---- Start of Inner Exception ----
    at VI.JobService.MSSqlJobProvider..ctor(RequestDispatcher dispatcher, String id)
    at VI.Base.ConfigSettings._Init()
    at VI.Base.ConfigSettings._ReadConfig()
    ---- Start of Inner Exception ----
    at VI.Base.ConfigSettings._ReadConfig()
    at VI.Base.ConfigDataExtensions.CheckRequiredParameters(IConfigData category, String[] parameters)
    at VI.Base.EncryptedConfigData.Get(String valueName)
    ---- Start of Inner Exception ----
    at VI.Base.EncryptedConfigData.Get(String valueName)
    at VI.Base.EncryptedConfigScope._Decode(String data)
    ---- Start of Inner Exception ----
    at VI.Base.EncryptedConfigScope._Decode(String data)
    at System.Security.Cryptography.ProtectedData.Unprotect(Byte[] encryptedData, Byte[] optionalEntropy, DataProtectionScope scope)<x>
    <r>2021-06-09 14:13:41 +03:00 - M2IDMJS-ONEIMSE - Serious: No job providers configured.<x>
    <r>2021-06-09 14:13:41 +03:00 - M2IDMJS-ONEIMSE - Serious: Provider value names: sqlprovider<x>
    <e>2021-06-09 14:13:41 +03:00 - M2IDMJS-ONEIMSE - Error occurred in Job Service (thread: <Unknown>):
    [821049] Error starting One Identity Manager Service.
    [System.Exception] No job provider configured.<x>
    <d> at VI.JobService.JobService._StartJobService()<x>
    <w>2021-06-09 14:13:41 +03:00 - M2IDMJS-ONEIMSE - Warning: Error starting the service. Retrying after 00:01:30.<x>

  • Did you follow these steps from Andreas?

    - Reusing a JobService configuration file for a different account or on a different machine causes this error.
    - If you copied the configuration from the other node, edit it and reenter the connection information.

  • For simplicity, I disabled encryption of the database and reconfigured the Job server. Now the logs contain the following errors:

    2021-06-11 13:48:17 +03:00 - JS-ONEIMSE - Serious: Last process step request failed with error: '[810023] Error during execution of statement: exec QBM_PJobQueueLoad N'\JS-CL', 90, -1, 1180940955, N'', N'7b485089-fbc0-4bfb-9f6b-8987b1fcc3f8'-->[810143] Database error 50000: detected in (SRV=MDB2, DB=OneIM) Procedure QBM_PJobQueueLoad, Line 42-->[810143] Database error 50000: Session ID for queue \JS-CL does not match. DB: 8a42acf6-b97c-455b-8738-bc631e28597b Query:7b485089-fbc0-4bfb-9f6b-8987b1fcc3f8.'!
    2021-06-11 13:48:17 +03:00 - JS-ONEIMSE - Error occurred in DbRequestQueue.Process (thread: Database Job Requests):
    [821002] Error requesting queue '\JS-CL' for database 'DB-AO.test\OneIM'.
    [810023] Error during execution of statement: exec QBM_PJobQueueLoad N'\JS-CL', 90, -1, 1180940955, N'', N'7b485089-fbc0-4bfb-9f6b-8987b1fcc3f8'
    [810143] Database error 50000: detected in (SRV=DB2, DB=OneIM) Procedure QBM_PJobQueueLoad, Line 42
    [810143] Database error 50000: Session ID for queue \JS-CL does not match. DB: 8a42acf6-b97c-455b-8738-bc631e28597b Query:7b485089-fbc0-4bfb-9f6b-8987b1fcc3f8.
    at VI.JobService.DbProvider.DbRequestQueue.Process(ProviderRequest request)
    at VI.JobService.MSSqlJobProvider._MsSqlRequestQueue._HandleGetJobs(IDbSession dbSession, GetJobsProviderRequest request)
    ---- Start of Inner Exception ----
    at VI.JobService.MSSqlJobProvider._MsSqlRequestQueue._HandleGetJobs(IDbSession dbSession, GetJobsProviderRequest request)
    at VI.Base.SyncActions.Do[T](Func`1 function)
    at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
    ---- Start of Inner Exception ----
    at VI.DB.DataAccess.ReadOnlyDbSession.<SqlExecuteAsync>d__38.MoveNext()
    ---- Start of Inner Exception ----
    at VI.DB.DataAccess.ReadOnlyDbSession.<SqlExecuteAsync>d__38.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at VI.DB.DataAccess.ReadWriteDbSession.<IgnoreBrokenConnectionAsync>d__48`1.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at VI.DB.DataAccess.ReadOnlyDbSession.<>c__DisplayClass38_0.<<SqlExecuteAsync>b__0>d.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at VI.DB.DataAccess.SafeDbCommand.<_CheckedAsync>d__40`1.MoveNext()
    ---- Start of Inner Exception ----

  • You currently have more than one JobService instance running that fetches jobs for the queue "\JS-CL".
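
    The error line carries both session IDs: the one stored in the database and the one the requesting service sent. A small sketch for pulling them out of a log line, so you can compare which instance holds which ID. The regular expression is based only on the message format seen in this thread, and the function name is made up.

```python
import re

# Matches the message text as it appears in the logs quoted in this thread.
_MISMATCH = re.compile(
    r"Session ID for queue (?P<queue>\S+) does not match\. "
    r"DB: (?P<db>[0-9a-f-]{36}) Query:(?P<query>[0-9a-f-]{36})"
)


def parse_session_mismatch(line: str):
    """Return queue, DB session ID and query session ID, or None if no match."""
    match = _MISMATCH.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        r"[810143] Database error 50000: Session ID for queue \JS-CL does not match. "
        "DB: 8a42acf6-b97c-455b-8738-bc631e28597b "
        "Query:7b485089-fbc0-4bfb-9f6b-8987b1fcc3f8."
    )
    print(parse_session_mismatch(sample))
```

    The "Query" ID belongs to the service that wrote the log line. The "DB" ID belongs to whichever instance initialized the queue last.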

  • How can this be fixed?
    I have verified that the service is running on only one node.

    Here is what the problem looks like. If the active server is JS1, then both http://js1:1880 and http://js-cluster:1880 can be reached over HTTP.

    But if I manually switch to JS2, then http://js2:1880 works, while http://js-cluster:1880 does not open and, accordingly, errors appear in the logs (see my previous post).
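
    A minimal sketch for repeating this check after each switch. It only tests whether a TCP connection to port 1880 can be opened on each name; it does not look at the HTTP response. The host names are the ones from this post, and the connect parameter exists only so that the function can be exercised without a network.

```python
import socket


def probe(hosts, port=1880, timeout=3.0, connect=socket.create_connection):
    """Return {host: True/False} depending on whether a TCP connect succeeds."""
    results = {}
    for host in hosts:
        try:
            conn = connect((host, port), timeout)
            conn.close()
            results[host] = True
        except OSError:
            # Covers refused connections, timeouts and failed name resolution.
            results[host] = False
    return results


if __name__ == "__main__":
    for host, ok in probe(["js1", "js2", "js-cluster"]).items():
        print(f"{host}:1880 ->", "reachable" if ok else "NOT reachable")
```

    If js2 is reachable but js-cluster is not while JS2 is the active node, the cluster name probably still points at JS1 or the port is not bound for that name on JS2.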

  • When a JobService is started, it initializes the queue it is configured for. This includes setting a session ID. The JobService that initialized last holds the currently valid session ID.
    If JS2 is started and supposed to run, but JS1 was started in the meantime, JS2's session ID gets invalidated. JS2 needs to be restarted to become the owner of a new valid session ID.
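
    The behaviour described above can be modelled in a few lines. This is a toy model, not the actual QBM_PJobQueueLoad logic: the class and method names are invented, and the only rule it encodes is "the last service to initialize a queue owns its session ID".

```python
import uuid


class SessionMismatchError(Exception):
    """Raised when a service asks for jobs with a stale session ID."""


class QueueRegistry:
    """Toy model of the per-queue session ID kept in the database."""

    def __init__(self):
        self._current = {}  # queue name -> currently valid session ID

    def initialize(self, queue: str) -> str:
        # A starting service takes over the queue and invalidates the old ID.
        session_id = str(uuid.uuid4())
        self._current[queue] = session_id
        return session_id

    def request_steps(self, queue: str, session_id: str) -> str:
        current = self._current.get(queue)
        if session_id != current:
            raise SessionMismatchError(
                f"Session ID for queue {queue} does not match. "
                f"DB: {current} Query:{session_id}."
            )
        return "ok"
```

    In this model, if JS2 initializes the queue and JS1 initializes it afterwards, every request from JS2 fails until JS2 initializes again, which corresponds to restarting JS2 while JS1 stays stopped.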

  • I have restarted the IDM services on every node.

    I rebooted each server, both in turn and together.
    Through the failover cluster, I changed nodes, turned off nodes, and migrated from one node to another.

    What else can I try?

  • It turns out that the JS1 and JS2 servers both work when switching, but JS-Cluster works correctly only with JS1.