Compressive video sensing is the process of encoding multiple sub-frames into a single frame using controlled sensor exposures, and then reconstructing the sub-frames from that single compressed frame. Spatially and temporally random exposures are known to provide the most balanced compression in terms of signal recovery. However, sensors that achieve a fully random exposure at each pixel are difficult to realize in practice, because the sensor circuitry becomes too complex to be compatible with the required sensitivity and resolution. The exposure pattern must therefore be designed with the constraints imposed by the hardware in mind. In this paper, we propose a method that jointly optimizes the exposure pattern for compressive sensing and the reconstruction framework under hardware constraints. Through simulation and real experiments, we demonstrate that the proposed framework can reconstruct multiple sub-frame images with higher quality.
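The encoding step described above can be illustrated with a minimal NumPy sketch. It is not the implementation used in this paper. It assumes a binary exposure mask with one on/off value per pixel per sub-frame, and a captured frame equal to the per-pixel sum of the exposed sub-frames. The function name `compress_subframes`, the array sizes, and the 2x2 tiled mask (standing in for a generic hardware constraint) are all hypothetical choices made for illustration.

```python
import numpy as np


def compress_subframes(subframes, exposure_mask):
    """Encode T sub-frames into one coded frame (assumed forward model).

    subframes:     array of shape (T, H, W), scene intensity per sub-frame
    exposure_mask: array of shape (T, H, W), 1 where a pixel is exposed
    returns:       array of shape (H, W), the single compressed frame
    """
    subframes = np.asarray(subframes, dtype=np.float64)
    exposure_mask = np.asarray(exposure_mask, dtype=np.float64)
    if subframes.shape != exposure_mask.shape:
        raise ValueError("subframes and exposure_mask must have the same shape")
    # Each pixel integrates light only during the sub-frames in which it is exposed.
    return (exposure_mask * subframes).sum(axis=0)


rng = np.random.default_rng(0)
T, H, W = 16, 8, 8                      # illustrative sizes, not from the paper
video = rng.random((T, H, W))

# Fully random exposure: every pixel gets an independent on/off sequence.
random_mask = rng.integers(0, 2, size=(T, H, W))
coded = compress_subframes(video, random_mask)

# Hypothetical constrained pattern: one small 2x2 tile repeated over the sensor,
# so that only four distinct exposure sequences have to be driven.
tile = rng.integers(0, 2, size=(T, 2, 2))
tiled_mask = np.tile(tile, (1, H // 2, W // 2))
coded_tiled = compress_subframes(video, tiled_mask)
```

Reconstruction is the inverse problem: recovering the `T` sub-frames in `video` from `coded` given the known mask. The sketch covers only the encoding side.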