queryNode-related Configurations

Configurations related to queryNode, which runs hybrid search between vector and scalar data.

queryNode.stats.publishInterval

Description Default Value
The interval at which the query node publishes node statistics, including segment status, CPU usage, memory usage, and health status. Unit: ms. 1000
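The parameter names on this page are dotted paths. Assuming they correspond to nested keys in a YAML configuration file (an assumption; this page does not describe the file layout), the mapping can be sketched as follows. The helper name `nest` is hypothetical and used only for illustration.

```python
from typing import Any, Dict


def nest(dotted: Dict[str, Any]) -> Dict[str, Any]:
    """Expand dotted parameter names into a nested mapping."""
    root: Dict[str, Any] = {}
    for name, value in dotted.items():
        node = root
        *parents, leaf = name.split(".")
        # Walk (or create) one nested level per path component.
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return root


# Values below are the defaults listed on this page.
overrides = nest({
    "queryNode.stats.publishInterval": 1000,
    "queryNode.segcore.chunkRows": 128,
})
print(overrides)
```

The resulting dictionary has a single top-level `queryNode` key, which is the shape a YAML serializer would need to produce the nested form.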

queryNode.segcore.knowhereThreadPoolNumRatio

Description Default Value
The number of threads in knowhere's thread pool. If disk is enabled, the pool size is multiplied by knowhereThreadPoolNumRatio (valid range: [1, 32]). 4

queryNode.segcore.chunkRows

Description Default Value
Row count by which Segcore divides a segment into chunks. 128

queryNode.segcore.interimIndex.enableIndex

Description Default Value
  • Whether to create a temporary index for growing segments and sealed segments not yet indexed, improving search performance.
  • Milvus eventually seals and indexes all segments, but enabling this option improves search performance for queries issued immediately after data insertion.
  • This defaults to true, meaning that Milvus creates a temporary index for growing segments and for sealed segments that are not yet indexed when they are searched.
  • true

    queryNode.segcore.interimIndex.nlist

    Description Default Value
    Nlist of the temporary index. The recommended value is sqrt(chunkRows); it must be smaller than chunkRows/8. 128

    queryNode.segcore.interimIndex.nprobe

    Description Default Value
    Nprobe used to search the small index. Set it according to your accuracy requirement; it must be smaller than nlist. 16
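The constraints stated for nlist and nprobe can be checked mechanically. The following is a minimal sketch of such a check, not Milvus's own validation logic; the function names and the example values (chunkRows = 1024) are hypothetical and chosen only to illustrate the rules as written above.

```python
import math
from typing import List


def recommended_nlist(chunk_rows: int) -> int:
    """Recommended nlist per the description: sqrt(chunkRows)."""
    return math.isqrt(chunk_rows)


def check_interim_index(chunk_rows: int, nlist: int, nprobe: int) -> List[str]:
    """Return the list of violated constraints (empty if all hold)."""
    problems = []
    if not nlist < chunk_rows / 8:
        problems.append("nlist must be smaller than chunkRows/8")
    if not nprobe < nlist:
        problems.append("nprobe must be smaller than nlist")
    return problems


# Hypothetical values, not the defaults listed on this page.
chunk_rows = 1024
nlist = recommended_nlist(chunk_rows)
print(nlist, check_interim_index(chunk_rows, nlist, nprobe=16))
```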

    queryNode.segcore.interimIndex.memExpansionRate

    Description Default Value
    extra memory needed by building interim index 1.15

    queryNode.segcore.interimIndex.buildParallelRate

    Description Default Value
    The ratio of parallel interim index builds to the number of CPUs. 0.5

    queryNode.segcore.multipleChunkedEnable

    Description Default Value
    Enable multiple chunked search true

    queryNode.segcore.knowhereScoreConsistency

    Description Default Value
    Enable knowhere strong consistency score computation logic false

    queryNode.loadMemoryUsageFactor

    Description Default Value
    The multiplication factor used to calculate memory usage while loading segments. 1

    queryNode.enableDisk

    Description Default Value
    Enable the query node to load disk indexes and search on them. false

    queryNode.cache.memoryLimit

    Description Default Value
    2 GB (2 * 1024 * 1024 * 1024 bytes). 2147483648

    queryNode.cache.readAheadPolicy

    Description Default Value
    The read ahead policy of chunk cache, options: `normal, random, sequential, willneed, dontneed` willneed

    queryNode.cache.warmup

    Description Default Value
  • Options: async, sync, disable.
  • Specifies whether to warm up the chunk cache.
  • 1. If set to "sync" or "async", the original vector data is synchronously or asynchronously loaded into the chunk cache during the load process. This can substantially reduce query/search latency for a period after loading, at the cost of a concurrent increase in disk usage.
  • 2. If set to "disable", the original vector data is loaded into the chunk cache only during search/query.
  • disable

    queryNode.mmap.vectorField

    Description Default Value
    Enable mmap for loading vector data false

    queryNode.mmap.vectorIndex

    Description Default Value
    Enable mmap for loading vector index false

    queryNode.mmap.scalarField

    Description Default Value
    Enable mmap for loading scalar data false

    queryNode.mmap.scalarIndex

    Description Default Value
    Enable mmap for loading scalar index false

    queryNode.mmap.chunkCache

    Description Default Value
    Enable mmap for chunk cache (raw vector retrieving). true

    queryNode.mmap.growingMmapEnabled

    Description Default Value
  • Enable memory mapping (mmap) to optimize the handling of growing raw data.
  • Activating this feature significantly reduces the memory overhead associated with newly added or modified data.
  • However, this optimization may come at the cost of a slight increase in query latency for the affected data segments.
  • false

    queryNode.mmap.fixedFileSizeForMmapAlloc

    Description Default Value
    Temporary file size for the mmap chunk manager. 1

    queryNode.mmap.maxDiskUsagePercentageForMmapAlloc

    Description Default Value
    Percentage of disk used by the mmap chunk manager. 50

    queryNode.lazyload.enabled

    Description Default Value
    Enable lazyload for loading data false

    queryNode.lazyload.waitTimeout

    Description Default Value
    Maximum wait time in milliseconds before starting lazy-load search and retrieve. 30000

    queryNode.lazyload.requestResourceTimeout

    Description Default Value
    Maximum timeout in milliseconds when waiting to request resources for lazy load (5 s by default). 5000

    queryNode.lazyload.requestResourceRetryInterval

    Description Default Value
    Retry interval in milliseconds when waiting to request resources for lazy load (2 s by default). 2000

    queryNode.lazyload.maxRetryTimes

    Description Default Value
    Maximum number of retries for lazy load. 1

    queryNode.lazyload.maxEvictPerRetry

    Description Default Value
    Maximum number of evictions per retry for lazy load. 1

    queryNode.indexOffsetCacheEnabled

    Description Default Value
    Enable the index offset cache for some scalar indexes (currently only the bitmap index). Enabling this parameter can improve performance when retrieving raw data from an index. false

    queryNode.scheduler.maxReadConcurrentRatio

    Description Default Value
  • maxReadConcurrentRatio is the concurrency ratio of read tasks (search tasks and query tasks).
  • Max read concurrency is hardware.GetCPUNum * maxReadConcurrentRatio.
  • For example, a value of 2.0 means that max read concurrency is hardware.GetCPUNum * 2.
  • Max read concurrency must be greater than or equal to 1, and less than or equal to hardware.GetCPUNum * 100.
  • Valid range: (0, 100]
  • 1
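The relationship between the ratio and the resulting concurrency limit can be sketched as below. This is an illustration of the stated formula and bounds, not Milvus's implementation: the function name is hypothetical, `os.cpu_count()` stands in for hardware.GetCPUNum, and truncating the product to an integer is an assumption.

```python
import os
from typing import Optional


def max_read_concurrency(ratio: float, cpu_num: Optional[int] = None) -> int:
    """Compute cpu_num * ratio, bounded to [1, cpu_num * 100]."""
    if not 0 < ratio <= 100:
        raise ValueError("maxReadConcurrentRatio must be in (0, 100]")
    # Stand-in for hardware.GetCPUNum (assumption).
    cpus = cpu_num if cpu_num is not None else (os.cpu_count() or 1)
    # Truncation to an integer is an assumption of this sketch.
    value = int(cpus * ratio)
    return max(1, min(value, cpus * 100))


print(max_read_concurrency(1, cpu_num=8))    # ratio 1 on an 8-CPU node
print(max_read_concurrency(2.0, cpu_num=8))  # ratio 2.0 on an 8-CPU node
```

On an 8-CPU node, a ratio of 1 yields a limit of 8 and a ratio of 2.0 yields 16; very small ratios are raised to the lower bound of 1.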

    queryNode.scheduler.cpuRatio

    Description Default Value
    Ratio used to estimate the CPU usage of a read task. 10

    queryNode.scheduler.scheduleReadPolicy.name

    Description Default Value
  • fifo: Tasks are scheduled from a FIFO queue.
  • user-task-polling:
  • Each user's tasks are polled one by one and scheduled.
  • Scheduling is fair at task granularity.
  • The policy is based on the username used for authentication.
  • Requests with an empty username are treated as coming from the same user.
  • When there is only one user, the policy degrades to FIFO.
  • fifo
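The user-task-polling behavior described above resembles round-robin polling over per-user queues. The following is a minimal sketch of that idea under that assumption; it is not the Milvus scheduler, and the class and method names are hypothetical. For simplicity, an empty queue is dropped immediately here rather than retained for an expiry period.

```python
from collections import OrderedDict, deque


class UserTaskPolling:
    """Round-robin over per-user FIFO queues (illustrative only)."""

    def __init__(self):
        # username -> queue of pending tasks, in rotation order
        self._queues = OrderedDict()

    def add(self, username, task):
        # An empty username is simply one more key, so all such
        # requests share a single queue.
        self._queues.setdefault(username, deque()).append(task)

    def pop(self):
        if not self._queues:
            return None
        # Take one task from the user at the front of the rotation.
        username, queue = next(iter(self._queues.items()))
        task = queue.popleft()
        del self._queues[username]
        if queue:
            # Move the user to the back so other users get a turn.
            self._queues[username] = queue
        return task


scheduler = UserTaskPolling()
for user, task in [("alice", "a1"), ("alice", "a2"), ("alice", "a3"), ("bob", "b1")]:
    scheduler.add(user, task)
order = [scheduler.pop() for _ in range(4)]
print(order)
```

With a single user, the rotation never switches queues, so tasks come out in plain FIFO order.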

    queryNode.scheduler.scheduleReadPolicy.taskQueueExpire

    Description Default Value
    Controls how long (in seconds) a queue is retained after it becomes empty. 60

    queryNode.scheduler.scheduleReadPolicy.enableCrossUserGrouping

    Description Default Value
    Enable cross-user grouping when using the user-task-polling policy. (Disable it if users' tasks cannot be merged with each other.) false

    queryNode.scheduler.scheduleReadPolicy.maxPendingTaskPerUser

    Description Default Value
    Max pending task per user in scheduler 1024

    queryNode.levelZeroForwardPolicy

    Description Default Value
    Delegator level-zero deletion forward policy. Possible options: ["FilterByBF", "RemoteLoad"]. FilterByBF

    queryNode.streamingDeltaForwardPolicy

    Description Default Value
    Delegator streaming deletion forward policy. Possible options: ["FilterByBF", "Direct"]. FilterByBF

    queryNode.dataSync.flowGraph.maxQueueLength

    Description Default Value
    The maximum size of task queue cache in flow graph in query node. 16

    queryNode.dataSync.flowGraph.maxParallelism

    Description Default Value
    Maximum number of tasks executed in parallel in the flowgraph 1024

    queryNode.enableSegmentPrune

    Description Default Value
    Use partition stats to prune data in search/query on the shard delegator. false

    queryNode.queryStreamBatchSize

    Description Default Value
    Minimum batch size returned by a stream query. 4194304

    queryNode.queryStreamMaxBatchSize

    Description Default Value
    Maximum batch size returned by a stream query. 134217728

    queryNode.bloomFilterApplyParallelFactor

    Description Default Value
    Parallel factor used when applying primary keys to the bloom filter. Defaults to 4*CPU_CORE_NUM. 4

    queryNode.workerPooling.size

    Description Default Value
    The size of the worker query node client pool. 10

    queryNode.ip

    Description Default Value
    TCP/IP address of queryNode. If not specified, the first unicastable address is used.

    queryNode.port

    Description Default Value
    TCP port of queryNode 21123

    queryNode.grpc.serverMaxSendSize

    Description Default Value
    The maximum size of each RPC request that the queryNode can send, unit: byte 536870912

    queryNode.grpc.serverMaxRecvSize

    Description Default Value
    The maximum size of each RPC request that the queryNode can receive, unit: byte 268435456

    queryNode.grpc.clientMaxSendSize

    Description Default Value
    The maximum size of each RPC request that the clients on queryNode can send, unit: byte 268435456

    queryNode.grpc.clientMaxRecvSize

    Description Default Value
    The maximum size of each RPC request that the clients on queryNode can receive, unit: byte 536870912
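The four gRPC defaults are easier to compare when expressed in MiB. A small sketch of the conversion follows; the dictionary and helper names are hypothetical, and the byte values are the defaults listed above.

```python
# Default gRPC message size limits from this page, in bytes.
GRPC_LIMITS = {
    "queryNode.grpc.serverMaxSendSize": 536870912,
    "queryNode.grpc.serverMaxRecvSize": 268435456,
    "queryNode.grpc.clientMaxSendSize": 268435456,
    "queryNode.grpc.clientMaxRecvSize": 536870912,
}


def to_mib(num_bytes: int) -> int:
    """Convert a byte count to whole mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes // (1024 * 1024)


for name, size in GRPC_LIMITS.items():
    print(f"{name}: {to_mib(size)} MiB")
```

The send limit on the server side (512 MiB) matches the receive limit on the client side, and the server's receive limit (256 MiB) matches the client's send limit.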