Configuring a cluster

Hello experts!

There are two hosts: HV1 and HV2.

Each host runs a Job Server and a Job Server DB: JS1 and JS2, DB1 and DB2.

We set up a Windows Server 2019 failover cluster in which the JS and DB servers are combined, so we have a JS-Cluster and a DB-Cluster.

We use SQL Always On (Standard).

Difficulties with configuring a cluster in IDM:

1. In JobQueueInfo, our clusters are shown with errors.
2. If we try to open the web interface, nothing is there, even at 127.0.0.1:1880.

How do I fix this? I looked at the instructions, but they did not help. What am I missing?

  • Logging severity: Info.
    <i>2021-04-27 12:00:36 +03:00 - Info: Requesting process steps for queue \DB-CL.<x>
    <i>2021-04-27 12:00:36 +03:00 - Info: Deleting log file JobService.log_2021-03-19-12-00-00 ...<x>
    <r>2021-04-27 12:00:36 +03:00 - Serious: Last process step request failed with error: '[810023] Error during execution of statement: exec QBM_PJobQueueLoad N'\DB-CL', 90, -1, 1180940955, N'', N'a3af52ab-28d1-462c-ad3a-534b607cea4d'-->[810143] Database error 50000: detected in (SRV=DB2, DB=OneIM) Procedure QBM_PJobQueueLoad, Line 42-->[810143] Database error 50000: Session ID for queue \DB-CL does not match. DB: 49f0dc9c-59c3-4df9-9f30-db52205b2cd3 Query:a3af52ab-28d1-462c-ad3a-534b607cea4d.'!<x>
    <e>2021-04-27 12:00:36 +03:00 - Error occurred in DbRequestQueue.Process (thread: Database Job Requests):
    [821002] Error requesting queue '\DB-CL' for database 'db-ao.my_domain_name\OneIM'.
    [810023] Error during execution of statement: exec QBM_PJobQueueLoad N'\DB-CL', 90, -1, 1180940955, N'', N'a3af52ab-28d1-462c-ad3a-534b607cea4d'
    [810143] Database error 50000: detected in (SRV=DB2, DB=OneIM) Procedure QBM_PJobQueueLoad, Line 42
    [810143] Database error 50000: Session ID for queue \DB-CL does not match. DB: 49f0dc9c-59c3-4df9-9f30-db52205b2cd3 Query:a3af52ab-28d1-462c-ad3a-534b607cea4d.<x>
    <d> at VI.JobService.DbProvider.DbRequestQueue.Process(ProviderRequest request)
    at VI.JobService.MSSqlJobProvider._MsSqlRequestQueue._HandleGetJobs(IDbSession dbSession, GetJobsProviderRequest request)
    ---- Start of Inner Exception ----
    at VI.JobService.MSSqlJobProvider._MsSqlRequestQueue._HandleGetJobs(IDbSession dbSession, GetJobsProviderRequest request)
    at VI.Base.SyncActions.Do[T](Func`1 function)
    at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
    ---- Start of Inner Exception ----
    at VI.DB.DataAccess.ReadOnlyDbSession.<SqlExecuteAsync>d__38.MoveNext()
    ---- Start of Inner Exception ----
    at VI.DB.DataAccess.ReadOnlyDbSession.<SqlExecuteAsync>d__38.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at VI.DB.DataAccess.ReadWriteDbSession.<IgnoreBrokenConnectionAsync>d__48`1.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at VI.DB.DataAccess.ReadOnlyDbSession.<>c__DisplayClass38_0.<<SqlExecuteAsync>b__0>d.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at VI.DB.DataAccess.SafeDbCommand.<_CheckedAsync>d__40`1.MoveNext()
    ---- Start of Inner Exception ----
    <x>

  • Hi Boris,

    The error shown, "Session ID for queue <queue name> does not match", indicates that both Jobservice instances, DB1 and DB2, of the DB-Cluster are running.
    When running Jobservices on a failover cluster, the cluster has to be configured so that only the Jobservice on the active node is running. The Jobservices on all standby nodes should be stopped.

    Regarding the other issues:
    The JobQueueInfo checks the Jobservice instances with a heartbeat ping to the Jobservice web interface. For a failover cluster, the fully qualified name of the failover cluster should be entered as the FQDN of the Jobservice in the database.

    If the Jobservice web interface is not reachable, missing permissions are the likely cause. Check whether the account the Jobservice runs as has permission to allocate the configured port. (Keywords: netsh http add urlacl / netsh http show urlacl)
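
To spot the "does not match" condition in a long log, the two session IDs can be pulled out of the message text. This is a minimal sketch: the pattern is modeled only on the error line posted above, and reading "DB:" as the session registered in the database and "Query:" as the session of the requesting instance is an assumption based on that message.

```python
import re

# Pattern modeled on the error text in the posted log; the exact wording
# may differ between product versions (assumption).
MISMATCH = re.compile(
    r"Session ID for queue (?P<queue>\S+) does not match\. "
    r"DB: (?P<db>[0-9a-f-]{36}) Query:\s*(?P<query>[0-9a-f-]{36})"
)


def find_session_mismatches(log_text):
    """Return unique (queue, db_session, query_session) triples found in the log."""
    seen = []
    for m in MISMATCH.finditer(log_text):
        triple = (m["queue"], m["db"], m["query"])
        if triple not in seen:
            seen.append(triple)
    return seen


# Sample line copied from the log in this thread.
sample = (
    "[810143] Database error 50000: Session ID for queue \\DB-CL does not match. "
    "DB: 49f0dc9c-59c3-4df9-9f30-db52205b2cd3 "
    "Query:a3af52ab-28d1-462c-ad3a-534b607cea4d."
)

for queue, db_id, query_id in find_session_mismatches(sample):
    print(f"{queue}: database has {db_id}, request used {query_id}")
```

If the same queue keeps reporting two different IDs, that fits the explanation above that a second instance is serving the queue.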

  • According to screenshot i11, the urlacl refers to https, but your job service's web interface uses http. I don't know whether this causes an issue. You might also want to add http://*:1880/. This might require removing the https://*:1880/ entry.

    Is the web interface accessible by browser from any machine?

    According to image i3, you specified an IP address to listen on. Does it work if the field is left blank? (You might need to restart the job service.)

    Does the job service log show any errors regarding http when you restart the job service?
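
The scheme mismatch described above can be checked mechanically once the reserved URLs are known. The sketch below is only an illustration: `reservation_covers` is a hypothetical helper, the reserved URLs are typed in by hand rather than read from the system, and it compares scheme and port only, ignoring the host part (such as the `*` wildcard).

```python
import re

# Matches reservations such as "https://*:1880/".
URL_RE = re.compile(r"^(?P<scheme>https?)://(?P<host>[^:/]+):(?P<port>\d+)/")


def reservation_covers(reserved_url, scheme, port):
    """Return True if the reserved URL has the given scheme and port.

    The host part is deliberately ignored in this sketch.
    """
    m = URL_RE.match(reserved_url.strip().lower())
    if m is None:
        return False
    return m["scheme"] == scheme.lower() and int(m["port"]) == port


# Situation from the thread: an https reservation, but the service uses http.
reserved = ["https://*:1880/"]
print(any(reservation_covers(u, "http", 1880) for u in reserved))  # False

# After adding the http reservation suggested above.
reserved.append("http://*:1880/")
print(any(reservation_covers(u, "http", 1880) for u in reserved))  # True
```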

  • - Turned off Use SSL in the Designer and on all servers

    - Deleted https://*:1880/

    - Added http://*:1880/

    - Restarted the service

    It worked!!! Thanks!!!

    -------------------------------------

    On all servers, the service is in manual start mode. If I switch the node through the failover cluster, the Job Server shows an error again. But if I switch it back, everything is fine.

    So clustering is not working. What should be done?

    <i>2021-04-28 09:24:13 +03:00 - Info: Requesting process steps for queue \JS-CL.<x>
    <r>2021-04-28 09:24:43 +03:00 - Serious: Last process step request failed with error: '[810143] Database error 53: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)-->[System.Data.SqlClient.SqlException] A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)-->[System.ComponentModel.Win32Exception] The network path was not found'!<x>
    <e>2021-04-28 09:24:43 +03:00 - Error occurred in DbRequestQueue.Process (thread: Database Job Results):
    [810143] Database error 53: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)
    [System.Data.SqlClient.SqlException] A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)
    [System.ComponentModel.Win32Exception] The network path was not found<x>
    <d> at VI.JobService.DbProvider.DbRequestQueue.Process(ProviderRequest request)
    at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
    ---- Start of Inner Exception ----
    at VI.DB.DataAccess.DbSessionFactoryImpl.<CreateAsync>d__3.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at VI.DB.DataAccess.PhysicalConnectionPool.<CreateAsync>d__9.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at VI.DB.DataAccess.PhysicalConnectionPool.<GetAsync>d__27.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at VI.DB.DataAccess.PhysicalConnectionPool.<_CreateNewBucketAsync>d__30.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at VI.DB.DataAccess.PhysicalConnectionPool._Bucket.<CreateAsync>d__11.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at VI.DB.DataAccess.PhysicalConnectionPool._Bucket.<TryInitializeAsync>d__15.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at VI.DB.DataAccess.DbFactoryBase.<_CreateAndOpenConnectionAsync>d__13.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at VI.DB.DataAccess.PhysicalMsSqlConnection.<OpenAsync>d__19.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at VI.DB.DataAccess.PhysicalConnectionBase.<OpenAsync>d__16.MoveNext()
    ---- Start of Inner Exception ----
    at VI.DB.DataAccess.PhysicalConnectionBase.<OpenAsync>d__16.MoveNext()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    --- End of stack trace from previous location where exception was thrown ---
    at System.Threading.Tasks.Task.Execute()
    at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
    at System.Data.ProviderBase.DbConnectionFactory.<>c__DisplayClass31_0.<TryGetConnection>b__0(Task`1 _)
    at System.Data.ProviderBase.DbConnectionFactory.CreateNonPooledConnection(DbConnection owningConnection, DbConnectionPoolGroup poolGroup, DbConnectionOptions userOptions)
    at System.Data.SqlClient.SqlConnectionFactory.CreateConnection(DbConnectionOptions options, DbConnectionPoolKey poolKey, Object poolGroupProviderInfo, DbConnectionPool pool, DbConnection owningConnection, DbConnectionOptions userOptions)
    at System.Data.SqlClient.SqlInternalConnectionTds..ctor(DbConnectionPoolIdentity identity, SqlConnectionString connectionOptions, SqlCredential credential, Object providerInfo, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance, SqlConnectionString userConnectionOptions, SessionData reconnectSessionData, DbConnectionPool pool, String accessToken, Boolean applyTransientFaultHandling, SqlAuthenticationProviderManager sqlAuthProviderManager)
    ---- Start of Inner Exception ----
    <x>
    <i>2021-04-28 09:25:43 +03:00 - Info: Requesting process steps for queue \JS-CL.<x>

  • Hi Boris,

    There is something wrong with the second cluster node's configuration. I'd guess the Jobservice on the second node is blocked from network access, because it seems that neither outgoing traffic (SQL Server connection) nor incoming traffic (web interface) is working.
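
One way to test the "blocked from network access" guess is a plain TCP connection attempt from each node. This is a sketch under assumptions: `can_connect` is a hypothetical helper, and a successful TCP connect only shows that the port is reachable, not that the SQL login or the web interface works.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `can_connect("JS-CL", 1880)` could be run from another machine for the web interface, and `can_connect("db-ao.my_domain_name", 1433)` on the second node for outgoing SQL traffic. Port 1433 is the SQL Server default and only an assumption for this environment.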

  • All servers are running. The server running the service is listening on port 1880.

    But if I switch the node through the failover cluster, the active node is not reachable at the web address http://JS-CL:1880. However, if I enter http://JS1:1880 in the browser, the service is available.

    If I turn off the server, the node switches and everything works at http://JS-CL:1880.

  • It looks like your manual failover does not stop the Jobservice on node JS1 and does not start the Jobservice on JS2, whereas taking the first node offline causes a proper failover. Further configuration might be required. Unfortunately, I'm not familiar with Windows Server failover cluster configuration and can't give advice on configuration details.
    You might want to ask support for help. I'm almost certain there is a document regarding failover cluster configuration.
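
The rule from earlier in the thread, that the service runs on the active node only, can be written as a small check. This is a sketch: `find_violations` is a hypothetical helper, the service states are entered by hand, and the example state is only my reading of the manual failover described above.

```python
def find_violations(service_running, active_node):
    """Compare service states against the 'active node only' rule.

    service_running maps a node name to True (running) or False (stopped).
    Returns a list of human-readable problem descriptions.
    """
    problems = []
    for node, running in sorted(service_running.items()):
        if node == active_node and not running:
            problems.append(f"{node}: active node, but the service is stopped")
        elif node != active_node and running:
            problems.append(f"{node}: standby node, but the service is running")
    return problems


# Assumed state after manually moving the role to JS2.
for problem in find_violations({"JS1": True, "JS2": False}, active_node="JS2"):
    print(problem)
```

An empty result would mean the node states match what the cluster is expected to enforce.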

  • Thank you!

    I will write to support. If a solution to my problem appears, I will write here.

  • We contacted support, but there are no more detailed instructions.

    I have enabled logging on the job servers that we have in the cluster. Switching from node to node does not work correctly there. Maybe you can decipher our log and tell me what to fix?

    Logging severity: Warning.
    <e>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Error occurred in JobService.Initialize (thread: <Unknown>):
    [821045] Could not create job provider sqlprovider.
    [System.Reflection.TargetInvocationException] Exception has been thrown by the target of an invocation.
    [809012] Error reading configuration value ConnectString.
    [809004] Could not get value ConnectString.
    [809003] Error encrypting value.
    [System.Security.Cryptography.CryptographicException] Key not valid for use in specified state.
    <x>
    <d> at VI.JobService.JobService._InitializeJobProviders()
    at System.Activator.CreateInstance(Type type, Object[] args)
    at System.Activator.CreateInstance(Type type, BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes)
    at System.RuntimeType.CreateInstanceImpl(BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes, StackCrawlMark& stackMark)
    at System.Reflection.RuntimeConstructorInfo.Invoke(BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
    at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor)
    ---- Start of Inner Exception ----
    at VI.JobService.MSSqlJobProvider..ctor(RequestDispatcher dispatcher, String id)
    at VI.Base.ConfigSettings._Init()
    at VI.Base.ConfigSettings._ReadConfig()
    ---- Start of Inner Exception ----
    at VI.Base.ConfigSettings._ReadConfig()
    at VI.Base.ConfigDataExtensions.CheckRequiredParameters(IConfigData category, String[] parameters)
    at VI.Base.EncryptedConfigData.Get(String valueName)
    ---- Start of Inner Exception ----
    at VI.Base.EncryptedConfigData.Get(String valueName)
    at VI.Base.EncryptedConfigScope._Decode(String data)
    ---- Start of Inner Exception ----
    at VI.Base.EncryptedConfigScope._Decode(String data)
    at System.Security.Cryptography.ProtectedData.Unprotect(Byte[] encryptedData, Byte[] optionalEntropy, DataProtectionScope scope)<x>
    <r>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Serious: No job providers configured.<x>
    <r>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Serious: Provider value names: sqlprovider<x>
    <w>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Warning: The service has no write permissions for its own directory. These are required for automatic updates.<x>
    <e>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Error occurred in Job Service (thread: <Unknown>):
    [821049] Error starting One Identity Manager Service.
    [System.Exception] No job provider configured.<x>
    <d> at VI.JobService.JobService._StartJobService()<x>
    <w>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Warning: Error starting the service. Retrying after 00:01:30.<x>
    <e>2021-06-09 10:49:36 +03:00 - M2IDMJS-ONEIMSE - Error occurred in JobService.Initialize (thread: <Unknown>):
    [821045] Could not create job provider sqlprovider.
    [System.Reflection.TargetInvocationException] Exception has been thrown by the target of an invocation.
    [809012] Error reading configuration value ConnectString.
    [809004] Could not get value ConnectString.
    [809003] Error encrypting value.
    [System.Security.Cryptography.CryptographicException] Key not valid for use in specified state.
    <x>
    <d> at VI.JobService.JobService._InitializeJobProviders()
    at System.Activator.CreateInstance(Type type, Object[] args)
    at System.Activator.CreateInstance(Type type, BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes)
    at System.RuntimeType.CreateInstanceImpl(BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes, StackCrawlMark& stackMark)
    at System.Reflection.RuntimeConstructorInfo.Invoke(BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
    at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor)
    ---- Start of Inner Exception ----
    at VI.JobService.MSSqlJobProvider..ctor(RequestDispatcher dispatcher, String id)
    at VI.Base.ConfigSettings._Init()
    at VI.Base.ConfigSettings._ReadConfig()
    ---- Start of Inner Exception ----
    at VI.Base.ConfigSettings._ReadConfig()
    at VI.Base.ConfigDataExtensions.CheckRequiredParameters(IConfigData category, String[] parameters)
    at VI.Base.EncryptedConfigData.Get(String valueName)
    ---- Start of Inner Exception ----
    at VI.Base.EncryptedConfigData.Get(String valueName)
    at VI.Base.EncryptedConfigScope._Decode(String data)
    ---- Start of Inner Exception ----
    at VI.Base.EncryptedConfigScope._Decode(String data)
    at System.Security.Cryptography.ProtectedData.Unprotect(Byte[] encryptedData, Byte[] optionalEntropy, DataProtectionScope scope)<x>
    <r>2021-06-09 10:49:36 +03:00 - M2IDMJS-ONEIMSE - Serious: No job providers configured.<x>
    <r>2021-06-09 10:49:36 +03:00 - M2IDMJS-ONEIMSE - Serious: Provider value names: sqlprovider<x>
    <w>2021-06-09 10:49:36 +03:00 - M2IDMJS-ONEIMSE - Warning: The service has no write permissions for its own directory. These are required for automatic updates.<x>
    <e>2021-06-09 10:49:36 +03:00 - M2IDMJS-ONEIMSE - Error occurred in Job Service (thread: <Unknown>):
    [821049] Error starting One Identity Manager Service.
    [System.Exception] No job provider configured.<x>
    <d> at VI.JobService.JobService._StartJobService()<x>
    <w>2021-06-09 10:49:36 +03:00 - M2IDMJS-ONEIMSE - Warning: Error starting the service. Retrying after 00:01:30.<x>
    <e>2021-06-09 10:51:06 +03:00 - M2IDMJS-ONEIMSE - Error occurred in Job Service (thread: <Unknown>):
    [821049] Error starting One Identity Manager Service.
    [System.Exception] No job provider configured.<x>
    <d> at VI.JobService.JobService._StartJobService()<x>
    <w>2021-06-09 10:51:06 +03:00 - M2IDMJS-ONEIMSE - Warning: Error starting the service. Retrying after 00:01:30.<x>

  • Error reading configuration value ConnectString.
    [809004] Could not get value ConnectString.
    [809003] Error encrypting value.
    [System.Security.Cryptography.CryptographicException] Key not valid for use in specified state.

    This means, in most cases, that the Job Service user has no access to the private.key file used to encrypt the database.

    <r>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Serious: No job providers configured.<x>
    <r>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Serious: Provider value names: sqlprovider<x>
    <w>2021-06-09 10:46:48 +03:00 - M2IDMJS-ONEIMSE - Warning: The service has no write permissions for its own directory. These are required for automatic updates.<x>

    The Job Service needs to have write access to its own directory.
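
Whether an account can write to a directory can be tested by creating a temporary file there. This is a sketch: `can_write_to` is a hypothetical helper, and the result is only meaningful when the script runs under the same account as the Job Service.

```python
import tempfile


def can_write_to(directory):
    """Return True if a temporary file can be created in the directory."""
    try:
        # The file is removed again when the with block ends.
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers missing directories and denied permissions.
        return False
```

It would be called with the service's installation directory, for example `can_write_to(install_dir)`, where `install_dir` is whatever path the service was installed to.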

  • "This means, in most cases, that the Job Service user has no access to the private.key file used to encrypt the database."

    Where should the private.key be placed (in which folder)? I thought that when I installed the service and pointed the installer to the key, it copied the key.

  • According to the error, the JobService cannot read or decrypt the connection string stored in the JobService configuration file.
    On startup, the JobService encrypts any unencrypted sensitive information. The key is stored using a Windows API and is bound to the machine and account.

    Reusing a JobService configuration file for a different account or on a different machine causes this error.
    If you copied the configuration from the other node, edit it and re-enter the connection information.
    If you use a shared binary directory, re-enter the connection information and activate the "Do not protect encrypted configuration" option. This option has side effects that need to be considered.
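
The three errors in this thread each have a distinct text signature, so a log can be triaged with a simple lookup. This is a sketch: the signature strings come from the logs posted above, and the causes are the ones suggested in the replies, not an official error catalog.

```python
# (text to look for in the log, cause suggested in this thread)
SIGNATURES = [
    ("Session ID for queue",
     "more than one Jobservice instance is serving the same queue"),
    ("Database error 53",
     "the node cannot reach SQL Server over the network"),
    ("Key not valid for use in specified state",
     "the encrypted configuration was created on another machine or account"),
]


def triage(log_text):
    """Return the suggested causes whose signature appears in the log text."""
    return [cause for needle, cause in SIGNATURES if needle in log_text]


# Sample line copied from the last log in this thread.
sample = (
    "[System.Security.Cryptography.CryptographicException] "
    "Key not valid for use in specified state."
)
for cause in triage(sample):
    print("Possible cause:", cause)
```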
