Primary Data Server

I need a server to hold my primary (raw) data: data that has not been subjected to processing or any other manipulation. I need this because I don't know which tools I will use in the future. At the moment I might use Splunk and Graylog2, but if I later decide to switch to other tooling, exporting the data from them might be difficult. As long as the primary data is still available, I can import it into the new tooling.

Starting points:

As little processing as possible. Store data as text files.
Syslog capable.
FTP capable.
scp/sftp capable.
rsync capable.
Redundant storage.
Archiving / compression?
Monitoring / expected file age.
Back-up.
Silent bit rot detection.
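
The "expected file age" starting point could be checked along these lines. This is only a minimal sketch: the function name find_stale_files, the example paths and the one-hour threshold are assumptions for illustration, not part of the setup described here.

```python
import time
from pathlib import Path


def find_stale_files(paths, max_age_seconds, now=None):
    """Return (path, age) pairs for files that are missing or too old.

    age is the number of seconds since the last modification,
    or None when the file does not exist at all.
    """
    now = time.time() if now is None else now
    stale = []
    for path in paths:
        p = Path(path)
        try:
            age = now - p.stat().st_mtime
        except FileNotFoundError:
            # A file that never arrived is reported as stale too.
            stale.append((str(p), None))
            continue
        if age > max_age_seconds:
            stale.append((str(p), age))
    return stale


if __name__ == "__main__":
    # Hypothetical paths: one file per sender that should be written regularly.
    expected = ["/data/incoming/syslog/firewall.log",
                "/data/incoming/ftp/export.csv"]
    for name, age in find_stale_files(expected, max_age_seconds=3600):
        print("STALE", name, "missing" if age is None else int(age))
```

A monitoring system could run a check like this periodically and alert on any output.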

Hardware

ECC Memory

Configure storage

Incoming data will be stored locally on redundant RAID10 storage. It will then be synced with rsync to an NFS share on a NAS for archiving and back-up. This way, if the NAS is suddenly unavailable, the data stays on local storage and the rsync is retried. Once the NAS is available again, the rsync continues where it left off.
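
The retry behaviour described above could be sketched as follows. The function name sync_with_retry, the paths, the number of attempts and the delay are all assumptions for illustration. The rsync options used are --archive and --partial. The run and sleep parameters exist only so the logic can be exercised without a real NAS.

```python
import subprocess
import time


def sync_with_retry(source, destination, attempts=5, delay_seconds=60,
                    run=subprocess.run, sleep=time.sleep):
    """Run rsync until it succeeds or the attempts are used up.

    Returns True on success, False if every attempt failed.
    """
    command = [
        "rsync",
        "--archive",  # preserve timestamps, permissions and directory layout
        "--partial",  # keep partially transferred files for the next run
        source,
        destination,
    ]
    for attempt in range(1, attempts + 1):
        try:
            if run(command).returncode == 0:
                return True
        except OSError:
            # rsync could not be started at all; treat it as a failed attempt.
            pass
        if attempt < attempts:
            sleep(delay_seconds)
    return False


if __name__ == "__main__":
    # Hypothetical local data directory and NFS mount point.
    ok = sync_with_retry("/data/incoming/", "/mnt/nas/primary_data/")
    print("sync ok" if ok else "sync failed, NAS still unavailable?")
```

Because rsync skips files that are already up to date on the destination, a later run only transfers what is still missing. It may be worth checking os.path.ismount on the share first, so that an unmounted share does not cause rsync to write into the empty mount-point directory on local disk.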

Configure backup

Configure a backup job on the NAS to remote storage.

Install the server

CentOS 6.5

Install receivers

Syslog

rsyslogd
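
Once rsyslogd accepts remote messages, a test message can confirm that the receiver works. The sketch below assumes that rsyslogd has been configured to listen on UDP port 514; these notes do not say which transport will be used. The host name logserver.example.org and the function name send_test_message are placeholders.

```python
import logging
import logging.handlers


def send_test_message(host, port=514, text="primary data receiver test"):
    """Send a single syslog message over UDP to host:port."""
    # SysLogHandler uses UDP by default when given a (host, port) address.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    logger = logging.getLogger("primary_data_test")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    try:
        logger.info(text)
    finally:
        logger.removeHandler(handler)
        handler.close()


if __name__ == "__main__":
    # Placeholder host name for the primary data server.
    send_test_message("logserver.example.org")
```

After running it, the message should show up in whichever text file rsyslogd writes remote messages to. UDP gives no delivery confirmation, so a missing line may also point to a firewall rather than to rsyslogd itself.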

FTP

vsftpd

scp/sftp capable

OpenSSH (sshd, which provides both scp and sftp)

rsync capable

rsync

Monitoring

Services: sshd, vsftpd, rsync, rsyslogd
Filesystems
File age, specific logging
Bit rot detection, local and NAS share.
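
One possible way to detect bit rot on the local storage and on the NAS share is to compare SHA-256 checksums of the files in both places. This is a sketch under assumptions: the function names, the directory paths and the choice of SHA-256 are illustrative, not a decision made in these notes.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(reference, current):
    """Return (changed, missing): sorted relative paths that differ from,
    or are absent compared to, the reference manifest."""
    changed = sorted(
        name for name, digest in reference.items()
        if name in current and current[name] != digest
    )
    missing = sorted(name for name in reference if name not in current)
    return changed, missing


if __name__ == "__main__":
    # Hypothetical paths for the local RAID10 storage and the NAS share.
    local = build_manifest("/data/archive")
    nas = build_manifest("/mnt/nas/primary_data")
    changed, missing = compare_manifests(local, nas)
    print("changed:", changed)
    print("missing on NAS:", missing)
```

This only makes sense for files that are no longer being written, such as rotated logs, because a file that is still growing changes legitimately. Storing a manifest and comparing against it later would also catch rot that affects a single copy over time.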
